A Model for Trend Analysis in the Online Shopping Scenario Using Multilevel Hesitation Pattern Mining

The present paper proposes a new model for the exploration of hesitated patterns from multiple levels of the conceptual hierarchy in a transactional dataset. The usual practice of mining patterns has focused on identifying frequent patterns (i.e., items which occur together) in the transactional dataset, but it overlooks vital information about patterns which are almost frequent (but not exactly frequent), called “hesitated patterns.” The proposed model uses a reduced minimum support threshold (containing two values: attractiveness and hesitation) and a constant minimum confidence threshold with the top-down progressive deepening approach for generating patterns, utilizing the apriori property. To validate the model, an online purchasing scenario of books through e-commerce-based online shopping platforms such as Amazon has been considered, and it is shown how various factors contribute towards building hesitation to purchase a book at the time of purchasing. The present work suggests a novel way of deriving hesitated patterns from multiple levels in the conceptual hierarchy with respect to the target dataset.
Moreover, it is observed that the concepts and theories available in the existing related work of Lu and Ng (2007) focus only on the introductory aspect of vague set theory-based hesitation association rule mining, which is not useful for handling patterns from multiple levels of granularity. The proposed model, in contrast, is complete in nature and addresses the significant and untouched problem of mining “multilevel hesitated patterns”; it is useful for exploring hesitated patterns from multiple levels of granularity based on the considered hesitation status in a transactional dataset. These hesitated patterns can be further utilized by decision makers and business analysts to build a strategy for increasing the attraction level of such hesitated items (appearing in a particular transaction or set of transactions in a given dataset) so as to convert their state from hesitated to preferred items.


Introduction
In this constantly changing technological scenario, extracting a nugget of information from the transactional dataset is essential for the discovery of new patterns and association rules. The business community and decision makers who take crucial decisions on the basis of explored "information" or "knowledge" will have a better chance of survival in this competitive world. Moreover, in the recent past, the e-commerce industry has emerged as one of the most preferred options for shopping in the online mode; examples include Amazon, Flipkart, and Snapdeal. This has extended ease and convenience to the customers and, at the same time, resulted in competition among the service providers. Due to this, it has become essential for a business to know something that nobody else knows in its domain and to make the difference. For this to happen, business houses and decision makers need to refer to knowledge while making crucial decisions about products and promotional strategy planning for the growth of the organization.
This is where the present research work focuses.
Here, the concern is to analyze the transactional dataset, where each transaction is a record of items (purchased or almost purchased) placed in the cart (fully or partially executed). The objective of the analysis is to know the buying patterns of customers on the basis of their likes and dislikes. As evident from the literature, analytics has been exercised to reveal various types of patterns such as Frequent Patterns [1][2][3][4][5], Profitable Patterns [6], Conditional Patterns [7], Calendar-Based Patterns [8], and Log Pattern Mining [9] using various techniques of pattern mining [10]. Moreover, after the success of mining knowledge from datasets, researchers have dealt with certain specific situations and performed various tasks such as mining on data streams [11,12], recognition of handwritten expressions [13], investigating customer buying behavior through Visual Market Basket Analysis (VMBA) [14], automated assessment of shopping behavior [15,16], applying additional interestingness measures for association rule mining [17], and conditional discriminative pattern mining [18]. Researchers have also worked on improving the implementation of pattern mining algorithms using time stamp uncertainties and temporal constraints [19], preserving the privacy of frequent itemset mining using randomized response [20], and finding infrequent itemsets to discover negative association rules [21].
This work deals with Hesitation Information Mining [22], where the result is in the form of patterns commonly termed Hesitated Patterns. Mining hesitated patterns is crucial for market basket analysis or the online shopping scenario, where the retrieved patterns contain information about the items which are hesitated by the customers. Furthermore, a hesitated pattern is governed by some hesitation status [22], which works as a contributing state (or factor) for creating hesitation towards the item or itemsets which constitute the hesitated pattern.
Related literature mentions vague set theory as an essential tool for generating vague association rules (VARs) [22][23][24][25][26] from the hesitated pattern set. Further, based on the currently available research in the field, it is concluded that mining hesitated patterns at multiple levels of the concept hierarchy (with different values of the support threshold) is more effective and also helps to expose information from different levels of granularity.
This is referred to as Multilevel Association Rule Mining [27][28][29][30], and, in the context of the present research, it is known as Multilevel Hesitated Pattern Mining. In traditional association pattern mining [1], support and confidence are two important measures, which play a crucial role in generating frequent patterns and, further, in identifying association patterns or rules. Instead of the usually applied support and confidence measures, the proposed model utilizes two new measures, namely, attractiveness support value and hesitation support value, where attractiveness means that the item or product sells well and hesitation means that the item is hesitated by customers. As mentioned earlier, this paper proposes a new model for hesitation mining with the following objectives:
(i) To mine hesitated patterns from multiple levels of concept hierarchy
(ii) To discover hesitated association patterns or rules

1.1. Background. Lu and Ng [22] introduced the concept of vague association rule (VAR) mining. They handled vagueness and uncertainty using the concept of vague set theory. Lu and Ng also coined some terminologies related to vague association rule (VAR) mining, as mentioned below:

Intent. This shows the different states of an item or itemset, such as support (liking), against (disliking), and hesitation (unclarity).

Hesitation Status. The stage or reason at which items are hesitated or dropped by the customer.

Attractiveness. This value indicates how well and how frequently the item is purchased by the customers, i.e., the item is currently purchased by the customer and will also be purchased by the customer in the near future.

Hesitation. This value indicates how consistently customers hesitate to purchase a particular item or set of items.
Various researchers have suggested methodologies in which they have shown the computation mechanism based on the attractiveness and hesitation values associated with each item in the database of the assumed scenario. Further, based on the attractiveness and hesitation values, the AH pair database was constructed, which has been utilized to establish four types of relationship between two or more items, namely, A (Attractiveness), H (Hesitation), AH (Attractiveness-Hesitation), which gives an attractiveness and hesitation relation between a pair of items and is used further for the identification of hesitated patterns, and HA (Hesitation-Attractiveness), which gives a hesitation and attractiveness relation between a pair of items. For these relationships, four types of support (attractiveness support, hesitation support, attractiveness hesitation support, and hesitation attractiveness support) and four types of confidence (A confidence, H confidence, AH confidence, and HA confidence) were defined. It is observed that a few researchers have addressed this domain considering different datasets, with varied constraints. The work conducted by Pandey et al. [23] describes the computing mechanism for mining vague association rules (VARs) for class course information from a temporal database. Another dimension has been explored by Badhe et al. [6] in the form of a new model for mining profitable patterns from the transactional dataset. In the same sequence, the work in [24,25] has presented a genetic-based methodology for mining hesitated itemsets in the transactional dataset. In recent research [26], the authors have proposed elephant herding optimization-based vague association rule mining. This work also makes use of transactional data with a focus on the seasonal effect, for finding maximum profit. Dandotiya et al. [31] proposed a method to identify the optimized hesitation pattern from the transactional dataset using the weighted apriori and genetic algorithm. Dixit et al. [32] proposed a model for mining hesitated patterns from transactional datasets using vague set theory, considering only one hesitation state.
The literature reveals that no direct competing methods are available, and the related existing work [22] only introduces the concept of vague set theory and other formulations for vague association rule mining. The present paper, however, proposes a novel method with a complete mechanism to handle the information pertaining to hesitated patterns at multiple levels of granularity, which can be readily used by knowledge workers, organizations, and business analysts for making strategies and planning to improve the attractiveness level of hesitated items or patterns. The present paper is organized as follows: in Section 2, a new model for mining hesitated patterns is described; Section 3 illustrates the concept of the model with a suitable example of online shopping, while the outcome of this model is discussed in Section 4, followed by the conclusions drawn.

Workflow Diagram of Proposed Model.
The proposed model includes a premining phase that processes the data from the data source by cleaning and transforming it into the input-ready dataset, i.e., the multilevel transactional dataset (for the P-th level), and then applies the multilevel hesitation mining algorithm. As a result of mining, hesitated patterns are generated at each hesitation status, and this process continues till the highest level of the concept hierarchy. The generated pattern set is then supplied to the hesitated association pattern module for generating interesting and potentially useful patterns, which are used to help decision makers and knowledge workers with business strategy planning. The workflow diagram of the proposed model for multilevel hesitation pattern mining is shown in Figure 1.

Steps of Proposed Model.
There can be any number of reasons due to which the customer may hesitate to buy some products at the time of shopping, which may in turn result in a decrease in the sale of those products or items. Therefore, it is necessary to identify such hesitated products so that a promotional strategy can be framed. This section describes the step-by-step procedure of the proposed model, which helps in the exploration of hesitated patterns or items (based on some considered hesitation status) from the transactional dataset. The steps involved in finding hesitated patterns and generating rules are as follows:
Step 1: The given transactional dataset (TD) contains a set of transactions T_j (where j = 1, 2, 3, ..., m), and each transaction has a list of purchased and hesitated items. In this step, firstly, construct the concept hierarchy of the items present in the given dataset. Then, the given transactional dataset is transformed into the multilevel transactional dataset (MD). In this dataset, each item in a transaction is represented as (Item_name, status of item), where Item_name refers to the item, written in multilevel taxonomy, while the status of the item indicates whether the item is purchased or hesitated. If the item is purchased, then the status value of the item is 1, but if the item is hesitated by the customer, then its value is one of the hesitation status (h_1, h_2, ..., h_n, where h_i ⊆ h_{i+1}) at which the customer hesitated to purchase it.
Step 2: For finding hesitated frequent patterns at different levels in the hierarchy of the transactional dataset, a variable P is considered, where P ∈ {1, 2, 3, ..., M}, and this variable keeps track of the level number which is being processed.
This step encodes each item (either purchased or hesitated) present in the transactions of the multilevel transactional dataset. The encoding of each item is performed by using the sequence number of the item, which depends on the level in the hierarchy, i.e., L_i (where i = 1, 2, 3, ..., M); after considering the class, all the remaining numbers are replaced by the symbol "*".
Step 3: Now, the items in an individual transaction are grouped according to their class, which depends on the level L_i (where i = 1, 2, 3, ..., M), and their occurrences are added (due to which the status value of the item, which is either purchased or hesitated, is changed) so that each grouped item in a transaction is in the form (Item_name, status of item purchase, status of item hesitated), where the status of item purchase also means the attractiveness of the item(s). This grouping is done in every transaction of the encoded multilevel transactional dataset individually.
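The encoding and grouping of Steps 2 and 3 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: item codes are taken to be digit strings such as "22" (one digit per hierarchy level), a status is either 1 (purchased) or a label such as "h2", and the function names `encode` and `group_transaction` as well as the sample transaction are hypothetical and not taken from the paper.

```python
from collections import Counter


def encode(item_code: str, level: int) -> str:
    """Keep the first `level` digits of a hierarchy code and mask the rest with '*'."""
    return item_code[:level] + "*" * (len(item_code) - level)


def group_transaction(transaction, level):
    """Count occurrences of each (encoded class, status) pair in one transaction.

    `transaction` is a list of (item_code, status) pairs; status is 1 for a
    purchased item or a hesitation label such as "h2" (assumed representation).
    """
    counts = Counter()
    for item_code, status in transaction:
        counts[(encode(item_code, level), status)] += 1
    return counts


# Hypothetical transaction, not taken from the paper's tables.
t1 = [("11", 1), ("12", "h2"), ("22", "h2"), ("31", 1)]
grouped = group_transaction(t1, level=1)
```

Grouping at level 1 merges "11" and "12" into the class "1*", so the purchased and hesitated occurrences of that class can be read off separately from `grouped`.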
Step 4: Consider another variable I, which is used to represent the length of the candidate pattern. It is represented as the I-candidate pattern, where I ∈ {1, 2, 3, ..., r}. For example, if the value of I is 1, then it is referred to as a 1-candidate pattern.
Step 5: For each level, it is necessary to define a two-valued minimal threshold support (denoted as β_P), represented as β_P = (β_p, β_h), where β_p represents the minimal threshold support for purchased item(s), or attractiveness value, while β_h represents the minimal threshold support for hesitated item(s). This minimal threshold support may be uniform or may differ across the levels in the hierarchy.
Step 6: Calculate the support of each I-candidate pattern at each level P at the different hesitation states h_k (k = 1, 2, ..., n). This support also contains two support values, and it is represented as (μ_p, μ_h), where μ_p represents the support value of the purchased item, or attractiveness value, and μ_h represents the support value of the hesitated item. The support value of an I-candidate pattern (x) at h_k is computed in terms of the following quantities: x is a pattern, m is the total number of transactions in the dataset, k is the number of the hesitation state, ν is the total number of times an item is purchased (attractive) in the transaction T_j of the level L_i dataset, η is the total number of times an item is purchased and hesitated in the transaction T_j of the level L_i dataset, and ι is the number of times an item is hesitated in the transaction T_j. The support value of the I-candidate pattern (x), in normal form, follows from equations (5)-(8). If μ(x) ≥ β_P, i.e., the support value of the pattern is greater than or equal to the minimal threshold support, the pattern is referred to as a hesitated frequent pattern.
Step 7: Now, using the hesitated frequent patterns generated in Step 6, construct (I + 1)-candidate patterns using the apriori candidate generation method [2][3][4]. Their support is calculated from the supports of the patterns being combined, where x and y are the two individual hesitated frequent patterns; the resulting support value is obtained by using equations (11) and (12).
Step 8: Now, repeat Steps 2 to 7 at each level L_i to mine hesitated frequent patterns. The process is continued till each level in the hierarchy is traversed.
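The apriori candidate generation used in Step 7 can be sketched as a join of frequent patterns followed by a prune based on the apriori property (every sub-pattern of a frequent pattern must itself be frequent). The sketch below is a generic, simplified form of that well-known technique and not the authors' implementation; the function name `generate_candidates` is hypothetical, and the support computation is deliberately left out.

```python
from itertools import combinations


def generate_candidates(frequent, size):
    """Build `size`-item candidates from frequent (size - 1)-item patterns."""
    frequent = {frozenset(p) for p in frequent}
    # Join step: unite pairs of frequent patterns whose union has exactly `size` items.
    joined = {a | b for a, b in combinations(frequent, 2) if len(a | b) == size}
    # Prune step (apriori property): every (size - 1)-subset must be frequent.
    pruned = [
        c for c in joined
        if all(frozenset(sub) in frequent for sub in combinations(c, size - 1))
    ]
    return sorted(tuple(sorted(c)) for c in pruned)


level1_frequent = [("1*",), ("2*",), ("3*",)]
pairs = generate_candidates(level1_frequent, 2)
triples = generate_candidates(pairs, 3)
```

With the three level-1 classes as input, the join yields the three pairs and then the single triple, which mirrors the progression from 1-candidate to 3-candidate patterns in the model.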
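Step 9: The predefined minimal confidence is represented as α = (α_p, α_h), where α_p is the attractiveness confidence and α_h is the hesitation confidence of an itemset.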

Multilevel Hesitated Pattern Algorithm.
The steps involved in the multilevel hesitated pattern algorithm are given in Algorithm 1.

Computational Complexity.
The computational complexity of the algorithm is expressed in terms of M, N, and h_k, where M is the highest level in the concept hierarchy, N is the maximum number of hesitated patterns in the (i − 1)-candidate patterns, and h_k is the number of hesitation states or status.

Computational Complexity of Multilevel Hesitated Pattern Algorithm.
In the pseudocode, the outer while-loop repeats at most once per level, i.e., M times; thus, it takes O(M) time. The first begin part inside the while-loop calculates the 1-candidate patterns using the mathematical formula, which takes constant time; however, since the 1-candidate patterns are calculated at each level and at every hesitation state, this takes O(M × h_k) time. After calculating the i-candidate patterns (i.e., i = 1), the (i + 1)-candidate patterns at each hesitation state are calculated, and pruning is performed in the inner while-loop. If the maximum number of hesitated patterns in the (i − 1)-candidate patterns is N, generating the (i + 1)-candidate patterns takes O(N^2 × h_k) time, and pruning takes O(2^N) time. The overall time complexity of the algorithm follows from combining these terms.

Illustration 1.
It is well known that a number of courses are part of the computer science discipline. Examples include Programming in C, Object-Oriented Programming, Data Structures, Theory of Computation, Operating Systems, and Database Management System. Further, to study and gain knowledge about these courses, students have to consult reference books. They may purchase these reference books through the online mode or in the traditional mode.
In this illustration, the online purchasing scenario of reference books is considered, and courses relating to the computer science discipline have been assumed, which include Programming in C, Data Structures, and Analysis & Design of Algorithms. Moreover, it is also considered that, for a specific course, several reference books are available. These books differ from one another in various aspects, such as content, publisher, and author. With this scenario, a concept hierarchy is developed, as shown in Figure 2.
During the online purchasing process, the customer might hesitate to purchase books for several reasons (hesitation status). In this illustration, these reasons are considered for the formulation of the hesitation status h_1, h_2, and h_3.
Hence, the objective is to find frequently hesitated books (due to any of the described hesitation status). The proposed model is applied to the considered dataset. The step-by-step procedure of the model is as follows:
Step 1: Let us consider a transactional dataset (TD), which contains ten transactions (T_j), namely, T_1, T_2, T_3, T_4, T_5, T_6, T_7, T_8, T_9, and T_10, and three hesitation status (h_1, h_2, h_3). The transactional dataset is shown in Table 1.
In multilevel taxonomy, the items are represented as shown in Table 2.
Step 2: The items present in the hierarchy are encoded in this step. The concept hierarchy, as shown in Figure 2, contains reference books for the computer science discipline as the root node, which is referred to as level 0. However, Programming in C, Data

Mathematical Problems in Engineering
Input: transactional dataset, minimum threshold support, minimum threshold confidence, number of hesitation status.
Output: hesitated patterns, hesitated association patterns
TD: initial transactional dataset
MD: multilevel transactional dataset // after transforming TD into multilevel taxonomy
M: highest level in the concept hierarchy // input
P: stores the currently processing level
CP_i: candidate pattern of size i // i = 1, 2, ..., t
HP_i: hesitated patterns of size i // i = 1, 2, ..., t
β_P: minimal threshold support as (β_p, β_h) // different for each level in the hierarchy
    // β_p is the attractiveness support and β_h is the hesitation support of an itemset
α: minimal threshold confidence as (α_p, α_h)
    // α_p is the attractiveness confidence and α_h is the hesitation confidence of an itemset
h_k: hesitation status // k = 1, 2, ..., n
Initialize: P = 1
While (P != M) do
begin
    // for each class at each hesitation status
    ... , (ν_{T_j}(y)/η_{T_j}))))
    where x and y are the two individual hesitated frequent patterns.

Step 3: The model traverses all the levels one by one. In the considered example, the hierarchy has two levels for traversing (level 0 is not considered), i.e., P = 1 and P = 2. For level 1, group the items present in the individual transactions of MD. After grouping, the modified multilevel transactional dataset for level 1 is transformed into a new layout, which is shown in Table 3.
Step 4: The next task after grouping the items is to find hesitated patterns, and these patterns have some length, which is denoted by I. The procedure for mining hesitated frequent patterns at various levels over all hesitation status is described in Steps 5, 6, and 8.

1-candidate patterns:
In the concept hierarchy, there are three 1-candidate patterns, i.e., {1*}, {2*}, and {3*}, and the dataset has three hesitation status h_1, h_2, and h_3. So, the support of each pattern at every hesitation status is calculated (using equation (10)).

Figure 2: Concept hierarchy of reference books for computer science.

The 1-candidate patterns with their support are shown in Table 4. Now, compare the support of every candidate pattern with the minimal threshold support (β).
Those patterns whose support is greater than or equal to the minimal threshold support (which, for this level, is (0.80, 0.50)) are referred to as hesitated frequent patterns and are shown in Table 5.
Step 7: Using these hesitated frequent patterns, 2-candidate patterns are generated by using the apriori candidate generation method [2,3] (this method is applicable only to the h_2 hesitation status because a sufficient number of patterns for pairing is available only in this hesitation status). After applying this method, the following candidates are generated: (1*, 2*), (1*, 3*), and (2*, 3*). Now, for I = 2,

2-candidate pattern:
Calculate the support of these generated 2-candidate patterns at the h_2 hesitation status by using equation (13). Then, the support of each 2-candidate pattern is compared with the minimal threshold support. The 2-candidate patterns which are hesitated frequent patterns are shown in Table 6. Using these hesitated frequent patterns, generate the 3-candidate patterns. The 3-candidate pattern generated is {1*, 2*, 3*}.

3-candidate pattern:
Now, calculate the support (by using equation (13)) for this generated pattern and compare it with the predefined support. The support value is greater than or equal to the minimal threshold support, so it is a hesitated frequent pattern. No further candidate pattern is generated. The process stops at this level and moves to the next level in the hierarchy. Now, repeat Steps 2 to 7 to calculate hesitated frequent patterns for level P = 2; after encoding the items according to level 2, the dataset is updated, as shown in Table 7.
Step 9: Calculate the confidence of all hesitated frequent patterns, which are mined at each level, by using equation (14). Consider that the minimal threshold confidence is (0.60, 0.45).
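As a rough illustration of how a two-valued confidence could be checked against the threshold (0.60, 0.45), the following sketch assumes that confidence is the component-wise ratio of the support of the combined pattern to the support of the antecedent, in the style of the usual support/confidence definition, and that a rule is kept only when both components reach their thresholds. Both of these are assumptions made for illustration, since the exact form of equation (14) is not reproduced in the text; the support values are hypothetical and the function names `confidence` and `passes` are not from the paper.

```python
def confidence(support_xy, support_x):
    """Component-wise ratio of (attractiveness, hesitation) supports (assumed definition)."""
    return tuple(s_xy / s_x for s_xy, s_x in zip(support_xy, support_x))


def passes(conf, alpha):
    """Assumption: a rule is kept when both components reach their thresholds."""
    return all(c >= a for c, a in zip(conf, alpha))


alpha = (0.60, 0.45)     # minimal threshold confidence from the illustration
supp_x = (0.50, 0.80)    # hypothetical support of the antecedent x
supp_xy = (0.40, 0.60)   # hypothetical support of the combined pattern (x, y)

conf = confidence(supp_xy, supp_x)
keep = passes(conf, alpha)
```

Under these hypothetical values, both components of the confidence exceed their thresholds, so the rule x => y would be retained.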

Illustration 2.
Similar to illustration 1, another online purchasing scenario (with a relatively large concept hierarchy) of grocery items (such as rice, flours, masala, and oil) can be considered. Moreover, the type and brand of a specific grocery item can also be considered. These items may differ from one another in their price, quantity, quality, etc. Considering this scenario, the concept hierarchy that will be developed is shown in Figure 3. In order to explore the hesitated items, the procedure depicted in illustration 1 is to be applied.

Discussion and Results
The proposed model is able to explore hesitated patterns from multiple levels in the concept hierarchy related to the target dataset. It is observed that the proposed model generates the hesitated patterns as expected, and the results can also be analyzed along quantitative and qualitative dimensions. Along the quantitative dimension, the proposed model generates all the patterns (that is, completeness). The model is complete in nature because it generates all hesitated patterns (of all sizes) at each level of granularity. Hence, the produced results cover the whole pattern set, which shows the sufficiency of the model from the quantitative point of view.
In the present work, certain hesitation status have been considered for validating the proposed model. It is observed that the model produces quality results (in terms of accuracy), which depend on the considered hesitation status. The inclusion of more hesitation status (obtained by various means such as surveying, buying behavior analysis of customers, experience, and the common intelligence of knowledge workers) may further help in improving the quality of the generated hesitated patterns. The results show that the model reveals the hesitated pattern set from multiple levels of granularity with the desired level of quality. Further, the quality is closely tied to the considered hesitation status and may be improved by taking more hesitation status into account during the computation process of the proposed model. Moreover, the quality can also be improved by applying an appropriately chosen optimization mechanism at the postmining stage to refine the generated hesitated pattern set on the basis of various interestingness factors.
Along with the qualitative and quantitative aspects, a parallel implementation of the proposed model (for larger catalogs) can be achieved by exploring the hesitated patterns with respect to every hesitation status at each level of the concept hierarchy in a distributed manner, i.e., level-wise hesitated patterns can be calculated on separate machines (level 1 hesitated patterns on machine 1, level 2 hesitated patterns on machine 2, and so on), and the results can be aggregated in the form of the global hesitated pattern set.
The proposed model is applied to the considered transactional dataset of the online book purchasing scenario through e-commerce platforms such as Amazon and Flipkart. As a consequence, hesitated frequent patterns are generated as follows: {1*}, {2*}, {3*}, {1*, 2*}, {1*, 3*}, {2*, 3*}, {1*, 2*, 3*}, {22}, {31}, {32}, and {22, 31} at h_2 (hesitation status). Subsequently, the hesitated association patterns or rules discovered from the hesitated patterns include (1* => 3*), (2* => 3*), (1*, 2* => 3*), and (22 => 31). These hesitated patterns and association patterns can be interpreted in the following manner. The association pattern {22 => 31} implies that the books {Data Structures, Algorithm and Application in C++ (Universities Press) => Fundamentals of Computer Algorithms (Universities Press)} are associated with each other, i.e., hesitated by most of the customers. In particular, this association pattern shows the certainty of hesitation for the other book title (right side of the rule or pattern set) when the book title on the left side of the rule or pattern set is hesitated. This is because of the hesitation status that the content of the books is not much different. Based on this hesitation information, the attractiveness of hesitated patterns or hesitated association pattern sets can be increased. In this way, organizations and business houses may plan their promotional strategies.
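The level-wise distribution mentioned in this section can be sketched as follows. This is a minimal sketch under stated assumptions: worker threads stand in for the separate machines, `mine_level` is a hypothetical placeholder that returns a few hard-coded patterns instead of running the actual mining procedure, and `mine_all_levels` is likewise an illustrative name.

```python
from concurrent.futures import ThreadPoolExecutor


def mine_level(level):
    """Placeholder for per-level mining; returns hard-coded patterns for illustration only."""
    results = {1: [("1*",), ("2*",)], 2: [("22",), ("31",)]}
    return level, results.get(level, [])


def mine_all_levels(levels):
    """Mine each level independently, then aggregate into one global pattern set."""
    with ThreadPoolExecutor() as pool:
        per_level = dict(pool.map(mine_level, levels))
    return {(lvl, pattern) for lvl, patterns in per_level.items() for pattern in patterns}


global_patterns = mine_all_levels([1, 2])
```

Because each level is mined independently of the others in this sketch, the aggregation step is a plain union of the per-level results.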

Conclusions
This work presents a new model for the exploration and discovery of the hesitated pattern set from transactional data relating to the online shopping scenario. The model is effective and useful for generating hesitated patterns, which can be further utilized for crucial decision-making purposes within an organization. This will enable organizations and business houses to survive in this competitive age. Using the proposed model, hesitated patterns can be identified and considered for turning hesitated items into preferred ones (by improving the attractiveness value). Moreover, the proposed model is capable of handling hesitated information from different levels of granularity, which shows the effectiveness of the generated hesitated patterns. However, as the dataset grows, a large number of hesitated patterns will be generated. This will consume a lot of processing time and may degrade the efficiency of the algorithm. One possible way to handle this situation is to make use of an appropriately chosen optimization mechanism [33][34][35].

Data Availability
No data are available. However, for the purpose of modeling, the data were assumed on the basis of the current online shopping scenario.

Figure 3: Concept hierarchy of grocery items.