Opinion Mining-Based Term Extraction Sentiment Classification Modeling

The spread of social media has accelerated the formation and dissemination of user review data, which contain subjective opinions of users on products, in an e-commerce environment. Because these reviews significantly influence other users, opinion mining has garnered substantial attention in analyzing the positive and negative opinions of users and deriving solutions based on these analytical results. Terms that carry sentiment information and are used in user reviews serve as the most crucial element in sentiment classification. In this regard, it is crucial to distinguish the most influential terms in user reviews. This study proposed a document-level sentiment classification model based on the collection and application of user reviews generated in an e-commerce environment. Here, a term information extraction method was applied to the proposed model to select core terms, classify the selected terms according to parts of speech (POS), determine terms that can increase information power and influence, and adopt these terms in opinion mining research, based on SVM, SVM+, and SVM+MTL techniques. The results obtained from evaluating the proposed model indicate that it exhibited excellent sentiment analysis performance. The proposed model is expected to be effectively utilized in providing enhanced services for users and increasing competitiveness in the e-commerce environment.


Introduction
The development of information and communications technology has accelerated the advancement of social media, which facilitates communications among users, accumulates an enormous amount of knowledge, and enables users to easily exchange knowledge in online environments. Social media have provided online users with the ability to obtain and analyze data beyond the limitations of time and space. Users who purchase products in e-commerce environments often write reviews on these products, which serve as crucial sources of information for other online users who might be interested in purchasing products related to these reviews. In other words, these reviews are utilized as essential information for providing enhanced services to e-commerce users. However, owing to the rapid increase in the number of online reviews due to big data, it has become difficult to obtain useful information by analyzing these reviews. Hence, opinion mining based on text mining has garnered significant attention, as this technique can present the results obtained from analyzing sentiments included in a considerable amount of user reviews [1][2][3].
Previous studies on opinion mining applied to online reviews primarily focus on classifying the reviews or opinions of users on products generated in an e-commerce environment according to sentiment polarity (positive or negative) [4][5][6]. Certain online reviews might provide users with incorrect information, such as advertisements or intentionally distorted information. Therefore, research on opinion mining should utilize core terms that represent the subjective opinions of users in the reviews. In other words, it is necessary to extract, from all the terms included in reviews, the core terms that function as the most significant elements in sentiment classification and that can represent documents. Hence, numerous studies have been conducted on opinion mining based on term information extraction [7][8][9]. Because terms that carry sentiment information play different roles depending on their part of speech (POS), these roles should be analyzed according to POS [10]. Nouns tend to reflect properties and sentiments related to products, while adjectives and verbs tend to include information on the subjective opinions and evaluation of users. Adverbs, in turn, serve various expressive and modifying functions in the process of document classification [11,12].
Existing studies have also adopted data mining techniques that predict the sentiments of users from text documents, approaching opinion mining from a different perspective [13][14][15]. Although several researchers have utilized data mining techniques to derive enhanced prediction performance, they have failed to address the inherent limitations of these techniques. To address this problem, a number of studies improved prediction performance by integrating data mining techniques. Increasing attention has also been paid to research proposing integrated models that consider the characteristics of each data mining technique to achieve better prediction performance, rather than repeatedly applying a simple model.
The objectives of this study are outlined as follows: First, in sentence-level analysis, a number of studies have been carried out that utilized various parts of speech to determine the main part of speech (POS) and conduct sentiment analysis. In particular, based on the approach of semantic analysis, adjectives bear the most sentiment information among different POS, and they are used as the core terms for classification of polarity. However, in a document-level analysis, optimal selection of terms is instrumental since selection of input variables determines the predictive performance of the model and has a significant impact on the classification of the sentiment of the entire document. Also, it has been reported that considering the entire document using various POS is more effective than considering only adjectives in a fragmented approach. Therefore, in this study, terms are extracted focusing on the four POS (adjectives, adverbs, verbs, and nouns) and the extracted terms are used as input variables for a model for document-level sentiment classification. Also, considering the superiority of adjectives in terms of sentiment bearing, we aim to examine whether there are differences in the relationship between the four POS and the prediction performance of different model types.
Second, in the document-level analysis, using an optimal number of terms as input variables for the model is important for effective classification of the document. If too many terms are used as input variables, it is highly likely that there will be missing values and high redundancy; thus, it is difficult to determine if sentiment information is included, and also, the efficiency of the analysis is reduced. Therefore, in this study, a variety of term information extraction techniques are used in order to select effective and efficient terms as input variables to be used in the model and compare the results and performance between different techniques.
Third, in opinion mining research, when data mining techniques are used, the selection of the variables used in the model and the method applied according to the characteristics of the data play the central role. In addition, since different results are obtained depending on the use of various data mining techniques, the model performance may vary. That is, it is important to explore and select the most efficient model by adopting various data mining techniques. Therefore, in this study, SVM+ and SVM+MTL techniques, which have not been implemented in opinion mining research to date, are applied, and models for document-level sentiment classification are developed to compare the prediction performance of the developed models.
Lastly, in opinion mining research, different single models are combined to develop a new integrated model, which serves as a method for further improving the predictive performance of a single model. In general, an integrated model is applied with various methods to improve the performance of a single model, and a number of studies have been carried out on the development and application of integrated models. Thus, in this study, based on the results of the models for document-level sentiment classification using SVM, SVM+, and SVM+MTL techniques, we propose an integrated data mining model and comparatively analyze the difference in predictive performance between the single model and the integrated model.
To this end, core terms were selected by adopting a term information extraction technique, and then, the selected terms were classified according to POS to determine the final terms that can increase information power and influence. Subsequently, this study presented an integrated sentiment classification model by applying SVM+ and SVM+MTL techniques to the research on opinion mining. To evaluate the performance of the proposed model, this study collected 80 000 user reviews on movies, games, music, and books on Amazon (http://Amazon.com), eliminated unnecessary terms, and extracted terms based on POS tagging. It also calculated the values of document frequency, term frequency-inverse document frequency (TF-IDF), information gain, and chi-square statistic to rank the extracted terms based on the values derived. Subsequently, the top 20 optimal terms were determined according to the categories, and the sentiments included in the selected optimal terms were classified by utilizing SVM, SVM+, and SVM+MTL techniques. The obtained evaluation results indicate that the sentiment classification model based on the chi-square statistic exhibited the best prediction performance among the models. In addition, the proposed sentiment classification model effectively analyzed sentiments included in terms. It is expected that the proposed term-based sentiment classification model can be effectively used to improve services, ensure competitiveness, and provide enhanced services for users in the e-commerce environment. The remainder of this paper is organized as follows: Section 2 reviews previous studies, while Section 3 introduces the proposed sentiment classification model based on terms at the document level. Furthermore, Section 4 presents the results obtained from conducting experiments on the proposed model. Finally, Section 5 provides conclusions and future research directions.

Related Research
2.1. Opinion Mining for Sentiment Classification. Opinion mining, which was developed from a data mining technique for document classification, refers to the process of extracting, classifying, and analyzing the opinions of users expressed in various media [16]. This technique, also called sentiment analysis, analyzes sentiments expressed in text and a state of positive or negative expressions on a certain subject included in the text. Owing to such characteristics, the field of opinion mining for social network analysis has garnered increasing interests [17,18]. Sentiment analysis generally determines the positive or negative tendency of an entire document by comprehensively analyzing opinions included in the document. To achieve this objective, the following analyses are performed at two different levels [19]: First, a document-level analysis is conducted to classify documents that express positive or negative opinions or sentiments. In this analysis, opinions on a certain subject are regarded as the basic unit of information. This analysis facilitates fast and comprehensive determination, as certain terms included in a document are selected according to frequency. Second, a sentence-level analysis is conducted to classify sentiments expressed in sentences that are included in a document. This analysis enables people to determine whether a positive or negative opinion is expressed, based on sentences in the document. Specifically, this analysis technique determines an opinion based on sentences, extracts sentences including sentiments, and examines paragraphs or phrases, considering different terms [20].
Sentiment classification is the process of extracting expressional terms in a document or a sentence from reviews or comments, identifying sentiments in various forms of documents, and classifying these sentiments according to polarity (positive or negative sentiments) [21]. Studies on sentiment classification adopt an approach based on vocabulary, machine learning (ML), or a method integrating vocabulary and ML [19]. In the vocabulary-based approach, a model analyzes the intentions of users based on positive and negative terms and the implications of these terms. In this process, it primarily adopts previously edited or well-known sentiment terms [22,23]. In the ML-based approach, a model learns behavior or patterns from the collected data and then makes predictions by classifying or analyzing these data. This process applies algorithms and linguistic variables used in the ML-based approach and depends on algorithms to solve document classification problems related to sentiment analysis. In the integrated approach, a model adopts both the vocabulary-based and ML-based approaches. In general, sentiment terms serve as crucial elements in this approach. Figure 1 illustrates various approach techniques for sentiment classification.

2.2. Term Information Extraction Technique. Term information extraction is the most significant methodology for determining terms that contain the greatest amount of information on document classification and establishing a classification model to solve sentiment classification problems at the document level in opinion mining research [24,25]. Terms used in reviews or comments contain sufficient information that can be adopted to classify linguistic properties and documents, thereby performing a crucial role in sentiment classification, and serving as an important basis for determining the main sentiment in a review. It is crucial to determine core terms, as these terms exert significant influence on determining the polarity of the entire document.
2.2.1. Document Frequency. Document frequency refers to the number of documents, among all documents, that contain a given term. In other words, it is the ratio of documents that contain terms used above a certain frequency. This technique is simple and requires less calculation than other techniques. It also improves the accuracy of document categorization by discarding low-frequency terms, under the assumption that terms that occur at low frequency do not contribute to document classification [26,27]. Equation (1) defines document frequency, where $N$ and $i$ denote the total number of documents and the term index, respectively.
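As a rough illustration of the idea described above, the following sketch counts, for each term, the number of documents that contain it and divides by the total number of documents. The function names and the toy reviews are hypothetical, and the normalization by $N$ is an assumption about the form of Equation (1).

```python
from collections import Counter


def document_frequency(docs):
    """Return {term: number of documents that contain the term}."""
    df = Counter()
    for tokens in docs:
        # A term is counted at most once per document.
        df.update(set(tokens))
    return df


def document_frequency_ratio(docs):
    """Return {term: share of documents that contain the term}."""
    n_docs = len(docs)
    return {term: count / n_docs for term, count in document_frequency(docs).items()}


# Toy tokenized reviews (illustrative only).
toy_docs = [["great", "movie"], ["bad", "movie"], ["great", "great", "plot"]]
```

Low-frequency terms can then be discarded by keeping only terms whose ratio exceeds a chosen threshold.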
2.2.2. Information Gain. Information gain is one of the most frequently used techniques in opinion mining research, which is known to be more influential than document frequency. This technique calculates the amount of information included in terms according to categories by considering both the occurrence and absence frequencies of terms, respectively. It is applied as an evaluation standard for measuring the usefulness of terms in the ML field. Information gain is calculated via the following process. First, a set of the entire documents is divided into subsets. Second, entropy, which changes before and after the use of a certain term, is calculated. Third, the probability of a target is divided and calculated according to categories, to derive an information value on the certain terms. Fourth, the gain is calculated to derive normalized gain [28,29]. Equation (2) defines information gain, where t denotes a term.
where $t_p$ is the number of times the term $t$ occurs in the document and $\bar{t}_p$ is the number of times the term $t$ does not occur in the document.
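The entropy-based procedure outlined above can be sketched as follows. This is a minimal version that assumes binary term presence and uses the textbook definition of information gain (class entropy before the split minus the weighted class entropy after splitting on presence or absence of the term); it is not necessarily identical to Equation (2), and the function names are hypothetical.

```python
import math


def entropy(counts):
    """Shannon entropy (base 2) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting documents on presence of `term`."""
    classes = sorted(set(labels))
    present = [lab for toks, lab in zip(docs, labels) if term in toks]
    absent = [lab for toks, lab in zip(docs, labels) if term not in toks]

    def class_counts(subset):
        return [subset.count(c) for c in classes]

    n = len(labels)
    h_before = entropy(class_counts(list(labels)))
    h_after = (len(present) / n) * entropy(class_counts(present)) \
        + (len(absent) / n) * entropy(class_counts(absent))
    return h_before - h_after
```

A term that appears only in positive documents of a balanced two-class corpus reaches the maximum gain of 1 bit, while a term that never appears yields a gain of 0.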

2.2.3. TF-IDF. TF-IDF is a method that can represent the significance of a term in individual documents. This method is frequently adopted because of its advantages, which include its similarity with document frequency, simple calculation processes, and excellent performance [30]. The significance of a term is proportional to the number of times the term appears in a document and inversely proportional to the number of documents that contain the term [31,32]. The TF-IDF technique was developed as a document representation method to evaluate the relative significance of terms in a document. This technique is frequently used in analyses based on the unit of a phrase or a paragraph [33]. It calculates the significance of a term by applying a weight to the term and employing TF, which refers to the number of times a word appears in a document, and DF, which is the number of documents containing the term. Equations (3) and (4) express the TF-IDF calculation processes based on a term ($t$) and document ($d$), where $N$ denotes the total number of documents, $TF_{t_i,d}$ the number of times the term ($t$) appears in the document ($d$), and $DF_t$ the total number of documents containing the term ($t$).
TF alone cannot be used to calculate the significance of a term because a high-frequency term is likely to be meaningless in a document. Consequently, both TF and IDF are reflected in the calculation process. IDF divides the number of the entire documents by the number of documents that contain the given term. Therefore, a term that appears in numerous documents has a low IDF value. In contrast, a term that is biased to certain documents has a high IDF value.
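The weighting described above can be sketched as follows. The sketch assumes the common form TF × log(N / DF) with a natural logarithm and no smoothing; the exact variant used in Equations (3) and (4) may differ, and the function name is hypothetical.

```python
import math


def tf_idf(docs):
    """Return one {term: weight} dict per document, using TF * log(N / DF)."""
    n_docs = len(docs)
    doc_freq = {}
    for tokens in docs:
        for term in set(tokens):
            doc_freq[term] = doc_freq.get(term, 0) + 1

    weights = []
    for tokens in docs:
        doc_weights = {}
        for term in set(tokens):
            tf = tokens.count(term)                   # occurrences in this document
            idf = math.log(n_docs / doc_freq[term])   # rarer terms get a larger IDF
            doc_weights[term] = tf * idf
        weights.append(doc_weights)
    return weights
```

With this form, a term that occurs in every document receives a weight of zero, which matches the observation that terms spread over many documents have a low IDF value.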

2.2.4. Chi-Square Statistic. The chi-square statistic, which analyzes a correlation between categorical variables, is the most frequently used technique among cross-tabulation analysis techniques. Specifically, the value of a correlation is "1" if a term appears more than once in the entire documents; otherwise, it is "0." In other words, this technique is used to identify the significance between a term ($t_i$) and a category ($C_j$).
The chi-square statistic is similar to mutual information; however, the former exhibits better performance for term extraction than the latter, owing to the standardized value of the former. Equation (5) defines the chi-square statistic ($\chi^2$) between $C_j$ and $t_i$ in a range of the consistent properties applied, where $N = A + B + C + D$ ($N$ is the total number of documents), $i = 1, 2, \cdots, n$ ($n$ is the total number of terms used), and $j = 1, 2, \cdots, m$ ($m$ is the total number of categories). Table 1 presents the document frequency between $C_j$ and $t_i$ [34,35].
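For a 2×2 contingency table between a term and a category, the statistic is commonly computed as in the sketch below. The cell layout is an assumption, since Table 1 is not reproduced here: A is the number of documents in the category that contain the term, B the number outside the category that contain it, C the number in the category without it, and D the number outside the category without it. The function name is hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/category contingency table.

    a: in category, term present     b: not in category, term present
    c: in category, term absent      d: not in category, term absent
    """
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        # A marginal total is zero, so the term carries no usable signal.
        return 0.0
    return n * (a * d - c * b) ** 2 / denominator
```

A term distributed identically inside and outside the category scores 0, and the score grows as the term concentrates in (or avoids) the category.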

2.3. Data Mining Techniques
2.3.1. SVM. The SVM technique, which was developed by Vapnik based on the statistical learning theory, exhibits outstanding performance in solving classification problems [36]. While most learning algorithms focus on empirical risk minimization, this technique aims at structural risk minimization. Because this technique generalizes well on classification problems, it has been applied in various fields. The SVM technique can also convert nonlinear problems in the input space into linear problems in a specific high-dimensional space, thereby facilitating convenient mathematical analysis [37,38]. Kernel functions generally used for the SVM technique include the polynomial kernel and Gaussian radial basis function (RBF). These functions map the input data into a high-dimensional space in which they can be linearly separated [39]. Figure 2 illustrates the decision boundary and margin of the SVM technique.
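The two kernel functions mentioned above can be written down directly. The sketch below uses their standard definitions; the parameter names `sigma`, `degree`, and `coef0` and their default values are illustrative assumptions, not values taken from this study.

```python
import math


def rbf_kernel(x, z, sigma=1.0):
    """Gaussian RBF kernel: exp(-||x - z||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2 * sigma ** 2))


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel: (<x, z> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, z))
    return (dot + coef0) ** degree
```

Both functions return the inner product of the two inputs in an implicit high-dimensional feature space, so the SVM never has to compute the mapping explicitly.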

2.3.2. SVM+.
Vapnik also proposed SVM+, which is an optimal SVM-based technique for managing learning with structured data (LWSD) or learning using privileged information (LUPI) to obtain hidden group data from training data [36]. In addition, Vapnik defined a slack variable in each group based on a modification function by considering group data and then presented a method of mapping input vectors in two different Hilbert spaces simultaneously. Figure 3 presents the mapping method.
The slack variable is restricted by a correction variable under the applied SVM+ environment. A mapping sample in a modification space should be placed on one side of a correspondence function. The SVM+ technique simultaneously maps data in both decision and modification spaces. When a slack modification function is defined in the same decision space, a decision function is defined in the decision space. Data in other groups are mapped in the same decision space, whereas a modification function can be defined in the same modification space or a different modification space, owing to the application of different correction variables in different groups [40]. Equations (6) and (7) define the objective function and the constraints of the SVM+ technique, respectively. Here, $R$ denotes the number of groups, while $\gamma$ adjusts the relative importance of the capacity of the modification space. $\xi^r$ represents the slack variables of each group and differs according to groups. A penalty coefficient ($C$) maintains a balance between the complexity of a model (i.e., training capability) and the error rate of a sample. That is, the SVM+ technique can be divided into two parts, where $W$ denotes the capacity of the decision space and $W_r$ denotes the capacity of the modification space. However, since $W_r$ does not determine the size of the margin, the SVM+ technique adds the term $(\gamma/2)\lVert W_r \rVert^2$ compared to the SVM technique.
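For orientation, a commonly cited form of the SVM+ optimization problem is sketched below. The notation is an assumption and may differ in detail from Equations (6) and (7): $T_r$ is the index set of training samples in group $r$, $\Phi_z$ and $\Phi_{z_r}$ are the mappings into the decision space and the modification space of group $r$, $\gamma$ weights the capacity of the modification space, and $b$ and $d_r$ are the bias terms of the decision and modification functions.

```latex
\min_{W,\,W_1,\dots,W_R,\,b,\,d_1,\dots,d_R}\quad
  \frac{1}{2}\lVert W \rVert^2
  + \frac{\gamma}{2}\sum_{r=1}^{R}\lVert W_r \rVert^2
  + C\sum_{r=1}^{R}\sum_{i\in T_r}\xi_i^r
\\[4pt]
\text{subject to}\quad
  y_i\bigl(\langle W,\Phi_z(x_i)\rangle + b\bigr) \ge 1-\xi_i^r,\qquad
  \xi_i^r = \langle W_r,\Phi_{z_r}(x_i)\rangle + d_r \ge 0,\qquad
  i\in T_r,\; r=1,\dots,R.
```

In this form the slack of each sample is not a free variable but the value of its group's modification function, which is how the group information constrains the decision function.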
The specific term information extraction processes of the SVM+ technique are consistent with those of the SVM technique. In the term information extraction process, special attention should be paid to the following issues:
Ribeiro et al. adopted the SVM+ method as a model for predicting corporate bankruptcy based on financial data obtained in France and then examined the prediction performance of the model by considering privileged information [41]. Specifically, they analyzed the prediction performance of the corporate bankruptcy prediction model by using heterogeneous information grouped according to the size categories of 30 companies relative to financial ratios, number of employees, and annual profits. Based on the results obtained from their analyses, they argued that the SVM+ technique always outperforms the SVM and SVM+MTL techniques, regardless of kernel functions. Moreover, the corporate bankruptcy prediction model that applies the Gaussian RBF as a kernel function exhibited better performance than the model that applies a linear kernel function based on the application of structured information in training data. Serra-Toro et al. elucidated the relationship between LUPI and the SVM+ technique, thereby indicating that the performance of this technique depends on a subtle relationship between normal data and privileged information [42]. They also stated that a randomly generated variable can also perform a crucial role when it contains privileged information on a specific problem.
In addition, they argued that the performance of the SVM+ technique can exceed that of the SVM technique according to data preprocessing, data set segmentation, validation protocols, experiments, and a range of parameters and search procedures.

2.3.3. SVM+MTL. Liang et al. developed an SVM+MTL technique by combining the SVM+ algorithm and the multitask learning (MTL) technique. Specifically, the SVM+ technique was adopted to define different decision functions for each group and identify a relationship between groups [43]. To solve problems related to the MTL technique, the SVM+MTL technique also maps data in two Hilbert spaces simultaneously, similar to the SVM+ technique. Equation (10) defines the decision function.
The SVM+MTL technique has the following advantages over the SVM+ technique:
(1) Group data also exist in prediction data
(2) A slack variable indicates an error in the entire model after a modification function is added, instead of being defined by the modification function
(3) A decision function has a correction term, unlike those used for the SVM and SVM+ techniques
Liang et al. proposed an MTL algorithm for the SVM+MTL technique. Their study utilized group data to solve MTL-related problems by applying the SVM+ technique [43]. Via this process, the prediction accuracy of the proposed model increased. Moreover, their study elucidated the relationships between SVM+ and SVM+MTL techniques and empirically compared the performance of sentiment classification models that apply these techniques by using comprehensive datasets. The results obtained from the comparison indicated that the sentiment classification model that applies the SVM+MTL technique exhibited excellent prediction performance when the amount of data was sufficient. However, when the amount of data was insufficient, sentiment classification models that apply the SVM or SVM+ technique exhibited outstanding performance. Figure 4 presents processing using the SVM+MTL technique.

Sentiment Classification Modeling
3.1. Sentiment Classification Model. The sentiment classification model proposed in this study determines and classifies core terms, which contain a significant amount of information and influence the entire document, for prediction. Figure 5 illustrates the three stages required to establish a term-based sentiment classification model at the document level, which include the formation of a term database (DB), selection of terms, and establishment of a sentiment classification model.

Processes of a Term DB Formation.
The processes of a term DB formation are as follows: First, customer review data generated on the web are crawled, collected, and stored in a customer review DB.
Second, these data are preprocessed because they are unstructured. In general, unstructured data obtained from the web should be converted to structured data to be used in an experiment. In other words, a customer review can accurately reflect a subjective opinion of an individual only when this review is considered as a document.
Third, structured data obtained via preprocessing are adopted to classify reviews as positive and negative reviews.

Mobile Information Systems
This classification process is required to identify positive or negative tendencies of customers in their opinions. Hence, this study developed a simple application program based on the Microsoft Foundation Class (MFC) library to document individual customer reviews. Fourth, all terms included in the documents are extracted. The extracted terms contain stop words, which appear frequently, exhibit weak information power, and do not provide special information. Hence, it is necessary to eliminate such stop words. Stop words that appear frequently in a document consist of articles (e.g., a, an, and the), conjunctions (e.g., that, and, and when), pronouns (e.g., I and you), and forms of the verb to be (e.g., is and are). The process of eliminating stop words is performed to select highly influential terms, develop a term DB based on the selected terms, and store the DB.
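The stop-word elimination step can be sketched as follows. The stop-word set contains only the examples listed above, and the regular-expression tokenizer is a simplifying assumption; the study's actual stop-word list and tokenization are not specified here, and the function name is hypothetical.

```python
import re

# Only the example stop words named in the text; a real list would be longer.
STOP_WORDS = {"a", "an", "the", "that", "and", "when", "i", "you", "is", "are"}


def extract_terms(review):
    """Lowercase a review, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", review.lower())
    return [token for token in tokens if token not in STOP_WORDS]
```

For example, `extract_terms("The movie is great and I love it")` keeps `movie`, `great`, `love`, and `it`, which would then be stored in the term DB.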

Term Selection. Subsequently, four POS (adjectives, adverbs, verbs, and nouns) are utilized to tag the terms stored in the term DB. Information on POS serves as a crucial index for deriving sentimental information from terms and can also be effectively used to determine terms with significant information power and influence by accurately analyzing the characteristics of terms. However, when POS tagging is not performed, the term-selection process requires numerous manual tasks, including term selection based on the opinions of experts and a considerable amount of cost and time.
In this study, an experiment was conducted in which the Stanford POS Tagger, a grammatical structure analysis tool, was used to tag all terms with the four POS and to measure the frequency of occurrence of these terms in all documents. Subsequently, term information extraction techniques were adopted to calculate term information values. Document frequency, TF-IDF, information gain, and the chi-square statistic were applied as term information extraction techniques in this study. After the information values of each term were calculated, the terms were ranked under each term information extraction technique to determine the terms to be used as input variables, and the selected terms were stored in a sentiment terms for documents (STFD) database.
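The ranking and selection step can be sketched as below. The function takes a dictionary of term information values produced by any one of the four techniques and returns the top-ranked terms. The function name and the alphabetical tie-breaking rule are assumptions; the default of 20 terms follows the number of input variables reported in this study.

```python
def top_k_terms(scores, k=20):
    """Rank terms by descending score (ties broken alphabetically) and keep the top k."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]
```

Running the function once per technique yields one candidate term list per technique, which can then be compared or stored separately.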

Sentiment Classification Model Formation.
The selected terms presented in Figure 6 are adopted as input variables, and the dataset including these terms is divided into three different datasets for training, evaluation, and verification. Next, SVM, SVM+, and SVM+MTL techniques for term information extraction are applied to establish document-level sentiment classification models.
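A three-way split of the dataset can be sketched as follows. The 60/20/20 proportions, the fixed random seed, and the function name are illustrative assumptions, since the split ratios are not stated here.

```python
import random


def split_dataset(samples, train=0.6, evaluation=0.2, seed=0):
    """Shuffle samples reproducibly and split them into train/evaluation/verification."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_eval = round(len(shuffled) * evaluation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_eval],
        shuffled[n_train + n_eval:],   # remainder is used for verification
    )
```

Fixing the seed keeps the three subsets identical across the SVM, SVM+, and SVM+MTL runs, so that differences in performance reflect the technique rather than the split.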

Term-Based Sentiment Classification Model. The term-based sentiment classification model proposed in this study is an integrated model that exhibits better performance than individual sentiment classification models, beyond the limitations of these models. Figure 6 presents the integrated sentiment classification model based on the SVM+MTL model, which reflects the experimental results of individual sentiment classification models based on the aforementioned term information extraction techniques, to exhibit better sentiment analysis performance. The integrated model based on the SVM+MTL technique uses as group data the probability values (prediction performance rates) obtained in the experimental results on individual sentiment classification models and is trained on them. Figure 7 presents the processes of establishing the integrated sentiment classification model proposed in this study. These processes are as follows:
(1) The experimental results based on sentiment classification models applying SVM, SVM+, and SVM+MTL techniques are derived, as illustrated in Figure 6
(2) The model that exhibited the best result among sentiment classification models that apply the SVM+MTL technique is selected
(3) The probability values of the remaining three models, that is, all models except the best model, are used as group data for the SVM+MTL technique
(4) Models are divided into specific models with group data applied in various ways for training, as illustrated in Figure 8
(5) Prediction results derived by the sentiment classification model are applied to the decision-making processes
When A, B, C, and D refer to document frequency, TF-IDF, information gain, and the chi-square statistic, respectively, and a model based on A exhibits the best performance in the experimental results based on sentiment classification models that apply the SVM+MTL technique, the sentiment classification model applying the SVM+MTL technique based on A is selected to establish the integrated sentiment classification model.
Accordingly, probability values of sentiment classification models that apply the SVM technique based on B, C, and D are calculated and adopted as group data. Consequently, sentiment classification models are subdivided and defined. Figure 8 illustrates the defined models. Figure 9 presents document-level sentiment classification models. Accordingly, sentiment classification models 1, 2, 3, 4, 5, and 6 are generated.

Experiment Data.
This study employed user review data generated on Amazon (http://amazon.com). Particularly, it collected user review data associated with the categories of movies, games, music, and books, in which users performed enormous purchase activities for relevant products. The data of these user reviews were directly crawled from July 2020 to August 2021, and MFC was utilized to document each user review. Accordingly, an application program for document classification was developed and utilized. First, the list of each item was formed. Then, documents corresponding to Score 4 or higher and documents corresponding to Score 2 or lower were classified and collected as positive and negative documents, respectively. Because documents corresponding to Score 3 were classified as neutral documents, they were excluded. Specifically, 150 movie reviews, 150 game reviews, 126 music reviews, and 134 book reviews were collected. For movie and game reviews, 30 000 review documents were generated for each category. Regarding music and book reviews, 20 000 review documents were generated for each category. The R program was utilized to calculate the occurrence frequency of the terms remaining in all documents after stop-word removal. Table 2 presents the number of documents for each category, number of terms collected via POS tagging performed by the Stanford POS Tagger program, and results obtained from extracting 1000 terms of the four POS that appeared frequently. Term information extraction techniques were applied to calculate term information values, and the terms were ranked according to each technique. Via this process, this study selected 20 terms of the four POS that appeared frequently, according to each technique.
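The labeling rule described above (Score 4 or higher is positive, Score 2 or lower is negative, and Score 3 is excluded as neutral) can be expressed as a short sketch. The function names and the `(score, text)` tuple layout are hypothetical.

```python
def label_review(score):
    """Map a review score to a sentiment label; neutral scores return None."""
    if score >= 4:
        return "positive"
    if score <= 2:
        return "negative"
    return None  # Score 3 is treated as neutral and excluded


def label_reviews(reviews):
    """Keep only non-neutral (score, text) reviews and attach their labels."""
    labeled = []
    for score, text in reviews:
        label = label_review(score)
        if label is not None:
            labeled.append((text, label))
    return labeled
```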
A large number of input variables can cause problems such as an increase in missing values and difficulties in model management. Conversely, a model with few input variables is more strongly affected by each variable it uses. A large number of input variables can also produce variables that are correlated with each other. Eliminating these variables reduces the variance of the predicted values and thereby improves the prediction power of the model. Hence, this study selected 20 input variables that had significant effects on classifying the properties of the documents and contained the greatest amount of information for document classification, and used them in the document-level analysis.
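As an illustration of how terms might be ranked and the 20 input variables selected, the sketch below scores each term with the chi-square statistic of its 2x2 term/class contingency table and keeps the top k. The formula is the standard chi-square form for term selection and is assumed, not confirmed, to match the definition used in this study; the function names and the toy documents are hypothetical.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class contingency table.

    a: positive documents containing the term
    b: negative documents containing the term
    c: positive documents without the term
    d: negative documents without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def top_k_terms(docs, labels, k=20):
    """Rank terms by chi-square score and return the k highest-scoring terms."""
    n_pos = sum(1 for y in labels if y == "positive")
    n_neg = len(labels) - n_pos
    pos_df, neg_df = Counter(), Counter()
    for terms, y in zip(docs, labels):
        # Document frequency: each term counts once per document
        (pos_df if y == "positive" else neg_df).update(set(terms))
    scores = {}
    for term in set(pos_df) | set(neg_df):
        a, b = pos_df[term], neg_df[term]
        scores[term] = chi_square(a, b, n_pos - a, n_neg - b)
    # Ties are broken alphabetically so the ranking is deterministic
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy documents, already tokenized
docs = [["great", "movie"], ["great", "plot"],
        ["boring", "movie"], ["boring", "plot"]]
labels = ["positive", "positive", "negative", "negative"]
selected = top_k_terms(docs, labels, k=2)
```

In this toy example the terms that occur in only one class receive the highest scores, while terms spread evenly over both classes score zero and are dropped.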
The experimental environment of this study was as follows. Experiments were carried out in MATLAB to calculate and evaluate the experimental results. The computer used in the experiments ran 64-bit Windows 10 on an Intel® Core™ i5-10500 CPU @ 3.10 GHz with 16 GB of RAM.

Term-Based Sentiment Classification Model Experiment Design.
To apply balanced data in this experiment, the entire datasets were classified into positive and negative documents. The document-level sentiment classification models that apply the SVM, SVM+, and SVM+MTL techniques adopt the Gaussian RBF kernel function and use grid search to identify optimal parameters. C denotes the upper limit value, γ is a parameter that adjusts the weight between the decision and modification functions, and σ1 and σ2 are the parameters of the Gaussian RBF kernel function in the decision and modification functions, respectively. For the group data adopted by the sentiment classification models that apply the SVM+ and SVM+MTL techniques, the year of issue of each item was used. The confusion matrix was applied to evaluate the prediction results of the document-level sentiment classification models that apply the SVM, SVM+, and SVM+MTL techniques. Figure 10 presents the single-model experiment process for sentiment classification.
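The kernel and parameter search described above can be sketched as follows. This is a minimal illustration under stated assumptions: the kernel is written in the common form exp(-||x - y||^2 / (2σ^2)), which may differ from the parameterization used in this study, and `toy_score` is a hypothetical stand-in for the validation accuracy of a trained model. Training of the SVM, SVM+, and SVM+MTL models themselves is not shown.

```python
import itertools
import math


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel, assuming the form exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def grid_search(evaluate, grid):
    """Try every parameter combination and return (best_params, best_score).

    evaluate: callable taking the parameters as keyword arguments and
              returning a score to maximize (e.g., validation accuracy).
    grid:     dict mapping parameter names to lists of candidate values.
    """
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(C, sigma):
    """Hypothetical score surface that peaks at C = 10 and sigma = 0.5."""
    return -((math.log10(C) - 1) ** 2) - (sigma - 0.5) ** 2


grid = {"C": [0.1, 1, 10, 100], "sigma": [0.25, 0.5, 1.0]}
best_params, best_score = grid_search(toy_score, grid)
```

For SVM+ and SVM+MTL, the same search would simply range over more names (for example γ, σ1, and σ2), at the cost of a grid that grows multiplicatively with each added parameter.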
As presented in Table 3, the confusion matrix shows the relationship between the actual categories and the categories predicted by the models. Accuracy is the proportion of documents whose sentiment is classified correctly; the higher the accuracy, the better the model performs. Reproducibility (recall) is the proportion of documents that actually belong to a category and are classified into that category, while precision is the proportion of documents predicted to belong to a category that actually belong to it. Equations (11), (12), and (13) define accuracy, reproducibility, and precision, respectively. The F-measure is a method used in data mining to evaluate model performance by comparing actual and predicted data through the confusion matrix. It measures classification performance by considering both precision and reproducibility, and it reflects the effects that arise when the actual data are skewed toward a certain category. It is frequently used to verify sentiment classification performance. Equation (14) defines the F-measure.
Figure 11: Comparison of overall model performance in movie reviews.
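The four measures can be computed directly from the cells of a two-class confusion matrix, as in the minimal sketch below. The cell names (`tp`, `fp`, `fn`, `tn`) and the balanced F-measure, taken as the harmonic mean of precision and reproducibility, follow the usual conventions and are assumed rather than taken from Table 3 or Equations (11) to (14); the counts in the example are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, reproducibility (recall), precision, and F-measure.

    tp: positive documents predicted positive
    fp: negative documents predicted positive
    fn: positive documents predicted negative
    tn: negative documents predicted negative
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts, for illustration only
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

Because the F-measure ignores the true negatives, it can differ noticeably from accuracy when one class dominates, which is why both are reported.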

[...] this study, both exhibited a prediction accuracy of 85.68%, thereby exhibiting better prediction performance than the single models. Figure 11 presents the results of comparing the performance of all models based on movie reviews.
The results of analyzing each sentiment classification model based on game reviews are as follows. The sentiment classification models that apply the SVM technique based on the chi-square statistic and on document frequency both exhibited a prediction accuracy of 80.27%, while the sentiment classification models that apply the SVM+ and SVM+MTL techniques based on document frequency and the chi-square statistic exhibited prediction accuracies of 75.27% and 80.72%, respectively. These results show that the sentiment classification model that applies the SVM+MTL technique based on the chi-square statistic exhibited the best prediction accuracy among the single models. Sentiment classification model 5, which is an integrated model proposed in this study, exhibited a prediction accuracy of 82.33%, indicating better prediction performance than the single models. Figure 12 presents the results of comparing the performance of all models based on game reviews.
The results of analyzing each sentiment classification model based on music reviews are as follows. The sentiment classification models that apply the SVM, SVM+, and SVM+MTL techniques based on the chi-square statistic, TF-IDF, and chi-square statistic exhibited prediction accuracies of 80.45%, 75.23%, and 79.85%, respectively. Based on these analytical results, it was verified that the sentiment classification model that applies the SVM+MTL technique based on the chi-square statistic exhibited the best prediction accuracy among the single models. Sentiment classification model 4, which is an integrated model proposed in this study, exhibited a prediction accuracy of 80.96%, indicating better prediction performance than the single models. Figure 13 presents the results of comparing the performance of all models based on music reviews.
Figure 12: Comparison of overall model performance in game reviews.
The results of analyzing each sentiment classification model based on book reviews are as follows. The sentiment classification models that apply the SVM, SVM+, and SVM+MTL techniques based on the chi-square statistic, document frequency, and chi-square statistic exhibited prediction accuracies of 80.93%, 72.29%, and 81.00%, respectively. These results show that the sentiment classification model that applies the SVM+MTL technique based on the chi-square statistic exhibited the best prediction accuracy among the single models. Sentiment classification model 6, which is an integrated model proposed in this study, exhibited a prediction accuracy of 81.11%, indicating better prediction performance than the single models. Figure 14 presents the results of comparing the performance of all models based on book reviews.
Tables 4 and 5 present the results of analyzing the sentiment classification models for the four cases (i.e., movies, games, music, and books). The analytical results indicate that the sentiment classification models that apply the chi-square statistic were the most effective at extracting core terms. This result was obtained because the chi-square statistic is closely related to both documents and terms. In other words, because this technique favors terms with significant information power, the significance of these terms was reflected in the documents.
This study compared the prediction performance of sentiment classification models that apply the SVM, SVM+, and SVM+MTL techniques according to the amount of data and the technique applied. The results suggest that the sentiment classification model proposed in this study exhibited the best performance. In addition, because the sentiment classification model that applies the SVM+MTL technique adopted group data for training, verification, and evaluation, it exhibited better performance than the sentiment classification models that apply the SVM and SVM+ techniques.
It was expected that the sentiment classification model that applies the SVM+ technique would exhibit better performance than the sentiment classification model that applies the SVM technique. However, the following problems were observed in this experiment. The first problem is related to the selection of group data. In this experiment, the year of product issue was selected as group data in the same experimental environment. Because the groups did not have a close relationship with each other in the sentiment classification model that applies the SVM+ technique, this model exhibited the lowest prediction performance among the models. The second problem is related to the amount of data.
This study conducted an experiment by extracting sample data and employing a small amount of data. The results of that experiment indicate that the model that applies the SVM+MTL technique exhibited the best performance, while the model that applies the SVM+ technique exhibited better prediction performance than the model that applies the SVM technique. However, as the amount of data increased, the model that applies the SVM+ technique exhibited the poorest performance, contrary to expectations. The final problem is related to the selection of the kernel function and parameter adjustment. The results of the experiments conducted in this study indicate that an appropriate kernel function and appropriate parameters should be selected.
Figure 13: Comparison of overall model performance in music reviews.
In this study, the performance of the proposed sentiment classification model was evaluated by comparing it with the prediction performance of single models. The results obtained from analyzing the performance of sentiment classification models based on movie, game, music, and book reviews indicate that the sentiment classification model that