Microblog Emotion Analysis Method Using Deep Learning in Spark Big Data Environment



Introduction
The development of network technology has made online communication increasingly frequent, through blogs, forums, and comments on e-commerce websites [1]. Users express their feelings about events or things by publishing information. Analyzing the text in social networks can help the government and other management institutions understand fluctuations in social mood, conduct public opinion analysis, judge how a situation is developing, give reasonable guidance, and maintain social stability [2][3][4][5]. From a commercial perspective, with the rise and popularity of e-commerce platforms such as Taobao and Amazon, users can evaluate products after purchase, which makes product information more transparent. The quality of product comments greatly affects users' desire to purchase.
According to the granularity of research, emotion analysis tasks can be divided into three categories: document level, sentence level, and aspect level [16]. Document-level emotion analysis treats the whole document as the basic unit and assumes that a document as a whole expresses only one emotional polarity. However, a document contains multiple sentences, and different sentences may have different emotional polarities [17][18][19]. Sentence-level emotion analysis is more fine-grained than document-level analysis and classifies the emotional polarity of a single sentence. Aspect-level emotion analysis differs from both: it considers, at a finer granularity, the emotional polarity together with the target of that emotion. The target here is an attribute word or aspect, which usually takes the form of an entity or an entity characteristic [20].
To address the problem that existing methods in the big data environment cannot sufficiently extract the emotional features of microblogs and that the average accuracy of their analysis results is low, a microblog emotion analysis method using deep learning in the Spark big data environment is proposed. The main innovations are as follows:
(1) The Jieba word segmentation method is used to process text comments, which effectively reduces the interference of irregular grammar and nonstandard words on the emotion analysis task for microblog text.
(2) Feature dimension reduction is carried out with the information gain feature selection method, to prevent the curse of dimensionality caused by an excessively large feature dimension.
(3) A microblog emotion analysis method based on the deep belief network (DBN) is established, and the DBN is parallelized on a Spark cluster, which effectively shortens the training time of the model.
The rest of the paper is arranged as follows: Section 2 reviews related work and the current research status of emotion analysis. Section 3 describes the structure and principle of the deep belief network. In Section 4, the proposed DBN microblog emotion classification model based on Spark parallel optimization is introduced in detail. Section 5 presents the experiments. Section 6 summarizes this study.

Related Works
Deep learning methods can better capture the grammatical and semantic features of text and are a research focus in emotion analysis. Jebbara et al. used a bidirectional gated recurrent unit (GRU) to extract attribute words and aspect-specific emotion, extracting features from the text to predict sentence labels [21]. Considering part-of-speech and corpus characteristics, Liu et al. proposed a method for attribute word extraction using RNNs, which achieved better performance than the traditional system based on conditional random fields [22]. To overcome the fixed window size of the convolutional neural network (CNN) model and better capture context information, Chen et al. drew on named entity recognition (NER) methods and proposed a text emotion analysis method based on the BiLSTM-CRF model to classify the BIO labels of entities in sentences [23]. Yin et al. proposed a long short-term memory (LSTM) model for cross-domain attribute word extraction, which used a rule-based method to generate an auxiliary label sequence for each sentence [24]. Li et al. incorporated attention into attribute word extraction and aspect category recognition and constructed a truncated history attention and selective transformation network on top of LSTM [25]. Wang et al. proposed a GRU-based coupled multilayer attention (CMLA) model to extract attribute words and opinion words [26]; during learning, it encodes and decodes the dual propagation of attribute words and opinion words and is not limited to syntactic relations. Zhang et al. proposed a text emotion classification model integrating content features and user features [27]. Jamal et al. proposed a Twitter emotion analysis framework based on the Internet of Things, which combined term frequency-inverse document frequency (TFIDF) with a deep learning model for emotion analysis. The original tweets were filtered by tokenization to capture useful features without noise, and TFIDF statistics were used to estimate the importance of local and global features. An adaptive comprehensive class balancing technique was used to solve the class imbalance problem between different emotions [28]. Jelodar et al. used the LSTM method to classify comments about COVID-19; the results have a certain impact on guidance and decision-making for COVID-19-related issues [29]. Wei et al. proposed a BiLSTM model based on multipolarity orthogonal attention for implicit sentiment analysis. Compared with the traditional single attention mechanism model, this method can effectively identify the differences between words and emotional tendencies, as verified in experiments [30].

Deep Belief Network
3.1. DBN Model Structure.
A DBN is a neural network model with multiple hidden layers. Because it is difficult to optimize the weights in deep structures such as the deep belief network, a greedy unsupervised training method is used to solve this problem. Figure 1 shows the structure of a deep belief network with three hidden layers $h_1$, $h_2$, and $h_3$; x is the input data and y is the output label corresponding to the input data. In the first step, the DBN pairs every two adjacent layers, trains the parameters between the two layers using the lower layer as input, and constructs the output layer. Moreover, propagation between the input layer and the hidden layer is bidirectional: it is divided into a forward process and a backward process in order to learn the data distribution. This way of building networks between layers is realized by the restricted Boltzmann machine (RBM) model. An RBM is a two-layer stochastic neural network. Nodes within the same layer are not connected to each other, while the nodes of the two layers are connected symmetrically and without direction, which is equivalent to an undirected graph. An RBM consists of a hidden layer composed of random hidden units and a visible layer composed of random visible units.
Because the RBM has connections between layers but none within a layer, it has the following important property: given the states of the visible units, the activation of the jth hidden neuron depends only on the neuron states of the visible layer, and its activation probability is
$$P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_i v_i w_{ij}\Big), \tag{1}$$
where $\sigma$ is the sigmoid activation function, $v_i$ represents the ith visible unit, $h_j$ represents the jth hidden unit, $w_{ij}$ is the weight between the ith visible unit and the jth hidden unit, and $b_j$ is the offset threshold of the jth hidden unit. Similarly, given the states of the hidden units, the probability that the binary state $v_i$ is 1, that is, the activation probability of the visible unit, is
$$P(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_j w_{ij} h_j\Big), \tag{2}$$
where $a_i$ is the offset threshold of the ith visible unit.
To determine the deep belief network model, one first needs to know the number of nodes in the visible layer and in the hidden layers. The number of nodes in the visible layer is the dimension of the input data. In some research fields the number of hidden layer nodes is tied to the number of visible layer nodes, for example when image data are processed with a convolutional restricted Boltzmann machine; that case is not analyzed here. In most cases, however, the number of hidden layer nodes must be chosen according to the application, or as the number that minimizes the energy of the model under given parameters.
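The two activation probabilities described above can be illustrated with a few lines of Python. This is a minimal sketch rather than the implementation used in this study: the function names, the list-of-lists layout `W[i][j]`, and the toy parameter values are all assumptions made for illustration.

```python
import math

def sigmoid(x):
    """Logistic function; maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def hidden_activation_prob(v, W, b):
    """Return P(h_j = 1 | v) for every hidden unit j.

    W[i][j] is the weight between visible unit i and hidden unit j,
    and b[j] is the offset of hidden unit j.
    """
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]

def visible_activation_prob(h, W, a):
    """Return P(v_i = 1 | h) for every visible unit i; a[i] is its offset."""
    return [sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(a))]

# Toy RBM with 3 visible and 2 hidden units (values chosen arbitrarily).
W = [[0.5, -0.5], [0.0, 1.0], [-1.0, 0.0]]
a = [0.0, 0.0, 0.0]
b = [0.0, 0.0]

p_h = hidden_activation_prob([1, 0, 1], W, b)
p_v = visible_activation_prob([1, 0], W, a)
```

Because there are no connections within a layer, each unit's probability is computed independently of the other units in the same layer, which is why both functions are simple per-unit loops.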

DBN Model Training.
The training of the DBN model is divided into two parts: an unsupervised pretraining process based on RBMs and a supervised parameter adjustment process. The unsupervised pretraining adopts a greedy layer-by-layer learning strategy. The initial input layer is the visible layer, and the input data are the text feature vectors. The data vector of the visible layer v, combined with the weight $w_1$, is used to infer the data vector of the hidden layer $h_1$; this is the training process of RBM1. Then, the data vector of the hidden layer $h_1$ is combined with the weight $w_2$ to infer the data vector of the hidden layer $h_2$, which is the training process of RBM2, and so on. That is, multiple RBMs are stacked: the output of the previous RBM is the input of the next RBM, and the hidden layer of the previous RBM is the visible layer of the next RBM. Training step by step up to the last layer completes the pretraining of the DBN. The specific steps are as follows:
Step 1. Randomly initialize the parameters (W, a, b), in which W is the weight matrix, $a = [a_1, a_2, \ldots, a_n]$ is the vector of offset coefficients of the visible layer, and $b = [b_1, b_2, \ldots, b_m]$ is the vector of offset coefficients of the hidden layer.
Step 2. Assign the input X to the visible layer $v^{(0)}$ and calculate the probability that each hidden layer neuron is activated:
$$P\big(h_j^{(0)} = 1 \mid v^{(0)}\big) = \sigma\Big(b_j + \sum_i v_i^{(0)} w_{ij}\Big). \tag{3}$$
Step 3. Perform Gibbs sampling to obtain the value of each neuron in the hidden layer:
$$h^{(0)} \sim P\big(h^{(0)} \mid v^{(0)}\big). \tag{4}$$
Step 4. Reconstruct the visible layer with the $h^{(0)}$ obtained in formula (4) and calculate the activation probability:
$$P\big(v_i^{(1)} = 1 \mid h^{(0)}\big) = \sigma\Big(a_i + \sum_j w_{ij} h_j^{(0)}\Big). \tag{5}$$
Step 5. Perform Gibbs sampling again and reconstruct the value of each neuron in the visible layer:
$$v^{(1)} \sim P\big(v^{(1)} \mid h^{(0)}\big). \tag{6}$$
Step 6. Calculate the activation probability of the hidden layer neurons again with the reconstructed visible layer neurons:
$$P\big(h_j^{(1)} = 1 \mid v^{(1)}\big) = \sigma\Big(b_j + \sum_i v_i^{(1)} w_{ij}\Big), \tag{7}$$
where $\sigma(\cdot)$ is the sigmoid activation function, whose graph is shown in Figure 2. The sigmoid is used as the activation function because its domain is R and its range is (0, 1). Therefore, whatever the range of the input data of the visible layer neurons, the activation probability of a node can be obtained through the sigmoid function.
Step 7. Obtain the new weight matrix W, visible layer offset coefficient a, and hidden layer offset coefficient b:
$$\begin{aligned} w_{ij} &\leftarrow w_{ij} + \varepsilon\Big(v_i^{(0)} P\big(h_j^{(0)} = 1 \mid v^{(0)}\big) - v_i^{(1)} P\big(h_j^{(1)} = 1 \mid v^{(1)}\big)\Big), \\ a_i &\leftarrow a_i + \varepsilon\big(v_i^{(0)} - v_i^{(1)}\big), \\ b_j &\leftarrow b_j + \varepsilon\Big(P\big(h_j^{(0)} = 1 \mid v^{(0)}\big) - P\big(h_j^{(1)} = 1 \mid v^{(1)}\big)\Big), \end{aligned} \tag{8}$$
where $\varepsilon$ is the learning rate.
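Steps 1 to 7 closely resemble one step of contrastive divergence (CD-1), the usual way of training an RBM. The sketch below follows that reading for a single binary input vector. It is an illustration under assumptions, not the code of this study: the helper names, the toy data, the initialization scale, and the learning rate of 0.1 are all hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def prob_hidden(v, W, b):
    """Activation probability of each hidden unit given visible states."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]

def prob_visible(h, W, a):
    """Activation probability of each visible unit given hidden states."""
    return [sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(a))]

def sample(probs, rng):
    """Draw a binary state for each unit from its activation probability."""
    return [1 if rng.random() < p else 0 for p in probs]

def cd1_step(x, W, a, b, lr, rng):
    """One update on input x; W, a, b are modified in place."""
    v0 = list(x)                     # Step 2: assign the input to the visible layer
    ph0 = prob_hidden(v0, W, b)      # Step 2: hidden activation probabilities
    h0 = sample(ph0, rng)            # Step 3: sample hidden states
    pv1 = prob_visible(h0, W, a)     # Step 4: reconstruction probabilities
    v1 = sample(pv1, rng)            # Step 5: sample the reconstructed visible layer
    ph1 = prob_hidden(v1, W, b)      # Step 6: hidden probabilities again
    for i in range(len(a)):          # Step 7: parameter update
        for j in range(len(b)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        a[i] += lr * (v0[i] - v1[i])
    for j in range(len(b)):
        b[j] += lr * (ph0[j] - ph1[j])
    return v1

rng = random.Random(0)
n_visible, n_hidden = 6, 3
# Step 1: small random weights, zero offsets (an assumed initialization).
W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
a = [0.0] * n_visible
b = [0.0] * n_hidden

data = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
for epoch in range(200):
    for x in data:
        cd1_step(x, W, a, b, lr=0.1, rng=rng)
```

To stack RBMs as described in the text, the hidden probabilities of a trained RBM would be fed as the input vectors of the next one.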
To sum up, pretraining only needs to compute the parameters of RBM1, RBM2, and RBM3 iteratively in turn to obtain the best weights (W, a, b).
The supervised parameter optimization of the DBN model first uses the forward propagation algorithm, with the parameters W and b obtained in pretraining, to determine whether the hidden layer neurons are activated. Let l be the number of layers of the neural network; the excitation value of each hidden layer neuron is calculated by formula (9). The computation then propagates upward layer by layer: the excitation values of the neurons in all hidden layers are calculated using formula (9) and standardized with the activation function, and finally the excitation value $h^{(l)}$ and the output vector X of the output layer are calculated. Then, the back propagation algorithm is used to update the parameters of the whole DBN network. The back propagation algorithm adopts the reconstruction error criterion: in its cost function, E is the reconstruction error, $X_l$ is the actual output of the output layer, $X_i$ is the theoretical output of the output layer, and $(W^{(l)}, b_l)$ represents the weight and offset coefficient of layer l. The reconstruction error reflects, to a certain extent, the likelihood of the training data. Finally, the gradient descent (GD) algorithm is used to update the weights and offset coefficients of the whole DBN network.
To sum up, the training purpose of the DBN model is to maximize the fit to the input data, and the output is a reconstruction of the training data. The visible layer neurons pass their own features to the hidden layer neurons, and the hidden layer neurons capture the higher-level features exhibited by the visible layer neurons through iterative training, which enhances the feature extraction ability of the model.
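For orientation, the forward pass, the reconstruction-error cost, and the gradient descent update described above are commonly written as follows. This is a sketch under assumptions: a squared-error cost is assumed, the learning rate symbol $\eta$ is not from the text, and the exact forms used in this study may differ.

```latex
\begin{aligned}
h_j^{(l)} &= \sigma\Big(b_j^{(l)} + \sum_i w_{ij}^{(l)}\, h_i^{(l-1)}\Big),
  \qquad \text{with } h_i^{(l-1)} \text{ replaced by } v_i \text{ for the first hidden layer}, \\
E &= \tfrac{1}{2}\,\lVert X_l - X_i \rVert^2, \\
W^{(l)} &\leftarrow W^{(l)} - \eta\,\frac{\partial E}{\partial W^{(l)}},
  \qquad
b_l \leftarrow b_l - \eta\,\frac{\partial E}{\partial b_l}.
\end{aligned}
```

In this reading, pretraining supplies the starting values of $W^{(l)}$ and $b_l$, and the supervised stage only adjusts them along the gradient of $E$.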

DBN Microblog Emotion Classification Model Based on Spark Parallel Optimization
Figure 3 shows the workflow of microblog emotion analysis in the proposed method. Before microblog emotion is classified, the text must be processed into a form that a computer can calculate with, that is, a data representation model. Then, an emotional dictionary is built, the emotional features are extracted from the microblog text, the extracted features are taken as input, the whole Spark parallel DBN model is trained, and the classification results are obtained, which realizes the emotional analysis of the microblog text.

Microblog Preprocessing and Feature Vector Construction
Text preprocessing is an indispensable part of text emotion analysis. Because people differ greatly in how they think and speak about emotion, text comments are often filled with strong personal styles. All kinds of irregular grammar and nonstandard words interfere with the emotion analysis task, so text preprocessing is very important. The text preprocessing in this study includes filtering out repeated corpus entries, filtering out irregular words, removing stop words, processing emoticons, and Chinese word segmentation. Jieba is selected for Chinese word segmentation. Jieba supports user-defined dictionaries and segments Chinese text well, which meets the needs of this study.
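The preprocessing steps listed above can be sketched as a small pipeline. This is only an illustration under assumptions: the noise patterns (links and @-mentions), the stop-word set, and the function name `preprocess` are hypothetical, emoticon processing is omitted, and the segmenter is passed in as a parameter so that the example runs without any third-party package.

```python
import re

def preprocess(corpus, segment, stop_words):
    """Deduplicate, clean, segment, and remove stop words from raw comments.

    `segment` is any callable that maps a string to a list of tokens.
    """
    seen = set()
    result = []
    for text in corpus:
        # Assumed noise patterns: strip links and @-mentions.
        text = re.sub(r"https?://\S+|@\S+", " ", text)
        text = re.sub(r"\s+", " ", text).strip()
        if not text or text in seen:   # filter out empty and repeated entries
            continue
        seen.add(text)
        tokens = [t for t in segment(text) if t.strip() and t not in stop_words]
        result.append(tokens)
    return result

corpus = [
    "the service is really good",
    "the service is really good",          # duplicate, dropped
    "I like it http://t.cn/abc",
]
stop_words = {"the", "is", "it"}
cleaned = preprocess(corpus, str.split, stop_words)
```

For Chinese text, `jieba.lcut` can be passed as `segment` when the Jieba package is installed, and `jieba.load_userdict` loads a user-defined dictionary before segmentation.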

Feature Construction.
Text feature selection is a key step in machine learning and determines the accuracy of emotion classification. This study selects four categories of features: features based on emotional rules, unigram features, syntactic features, and dependency word collocation features. The emotion-rule feature is obtained by extracting the effective information produced by a new rule method that improves on earlier work. Considering that phrase structure can reduce sentence ambiguity, we add bigrams and their combined part-of-speech tags to the feature set. The dependency feature is the dependency identifier obtained from the dependency parse tree. It plays an important role in annotating emotional category information and preserves information directly related to emotional words as well as other hidden information.
The method based on an emotion dictionary plays an important role in the history of text emotion analysis. Its core idea is to superimpose the polarities of emotion words and judge the emotional tendency of the text by the resulting value. The formula of the classical method is as follows:
$$\mathrm{Score} = \sum_{i=1}^{n} Sw_i. \tag{13}$$
In the above formula, the parameter $Sw_i$ represents the polarity of emotional word i, and the parameter n represents the number of emotional words in the text. The dictionary-based method can barely complete the task in some simple text tests, but given the complex grammar and the variety of language structures found in real use, its practical value is limited. Therefore, considering the defects of the classical method, a new emotion rule method is proposed. Because comment texts are generally short and consist essentially of separate clauses, the method takes each clause as a unit. Taking negative words, connectives, and other grammatical structures into account, the emotion calculation formula (equation (14)) is proposed to calculate the emotional tendency of each unit. The final emotional tendency of the text is judged by the sum of the scores of all units. If the score is positive, the text emotion is classified as positive; if the score is negative, the text emotion is classified as negative. In equation (14), the parameter n represents the number of emotional words in the text, the parameter $Pw_i$ represents the emotional extremum of emotional word i, the parameter m represents the number of words modifying emotional word i, the parameter $mod_j$ represents the weight of the corresponding modifier, and the parameter k represents the weakening or strengthening coefficient of the rules. This last parameter addresses a problem often ignored in emotion analysis tasks: the deviation of results caused by subject confusion. Table 1 gives a brief description of the emotional rules designed in the proposed method. Generally speaking, the more complete the emotional rules are, the better the emotional rule method performs. After the emotional rules are combined, the final score is calculated according to formula (14), and then three parameters are extracted as emotional features: the score of emotional words, the number of positive/negative emotional words, and the ratio of strengthening/weakening times of rules.
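The difference between the classical dictionary score and a clause-level rule score can be illustrated as follows. The classical score is the plain sum of polarities described above. The rule score is only one plausible reading of the parameters $Pw_i$, $mod_j$, and k (each emotional word's value multiplied by the weights of the modifiers directly before it and by k); the exact form of equation (14) is not reproduced here, and the toy dictionaries, the function names, and the "neutral" label for a zero score are assumptions.

```python
def classical_score(tokens, polarity):
    """Classical dictionary method: sum the polarity of every emotional word."""
    return sum(polarity.get(t, 0.0) for t in tokens)

def rule_score(tokens, polarity, modifiers, k=1.0):
    """Assumed clause score: polarity of each emotional word, scaled by the
    product of the weights of the modifiers immediately preceding it and by k."""
    score = 0.0
    weight = 1.0
    for t in tokens:
        if t in modifiers:
            weight *= modifiers[t]          # accumulate modifier weights
        elif t in polarity:
            score += k * polarity[t] * weight
            weight = 1.0                    # modifiers are consumed by this word
        else:
            weight = 1.0                    # an unrelated word breaks the chain
    return score

def classify(clauses, polarity, modifiers, k=1.0):
    """Sum the clause scores and map the sign of the total to a label."""
    total = sum(rule_score(c, polarity, modifiers, k) for c in clauses)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"                        # zero case is not covered by the text

polarity = {"good": 1.0, "bad": -1.0}       # toy emotion dictionary
modifiers = {"not": -1.0, "very": 1.5}      # toy negation and degree weights

clause = ["service", "not", "good"]
plain = classical_score(clause, polarity)
ruled = rule_score(clause, polarity, modifiers)
label = classify([["very", "good"], ["not", "bad"]], polarity, modifiers)
```

On the clause "service not good", the classical sum ignores the negation and stays positive, while the rule score flips the sign, which is the kind of defect the rule method is meant to address.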
For the other three emotional features, the sentence "the scenic spot service is really good, I like it very much!" is taken as an example to show the feature extraction process. The sentence is input into Jieba word segmentation to obtain "scenic spot /n service /n really /ad good /a , /wd I /rr very /d like /vi ! /wd", where /n stands for noun, /ad for adverbial word, /a for adjective, /wd for punctuation mark, /rr for pronoun, /d for adverb, and /vi for verb.
Based on the above word segmentation and tagging results, the syntactic features can be obtained: scenic spot service, service really, really good, good I, I very, like it very much, n, ad, a, rr, d, vi. The number of features is 12. After the word segmentation result is obtained, the dependency and word collocation features of the example sentence can be obtained by calling the StanfordNlp natural language processing toolkit. The specific relations and collocations are listed in Table 2.
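The 12 syntactic features of the example sentence can be reproduced with a short sketch: bigrams over the non-punctuation tokens plus the distinct part-of-speech tags. The function name and the `(word, tag)` pair layout are assumptions. The sketch joins the segmented tokens literally, so the last bigram appears as "very like" rather than in its translated form "like it very much".

```python
def syntactic_features(tagged, punct_tag="wd"):
    """Return bigram features followed by the distinct POS tags.

    `tagged` is a list of (word, tag) pairs; punctuation is skipped.
    """
    words = [(w, t) for w, t in tagged if t != punct_tag]
    bigrams = [f"{w1} {w2}" for (w1, _), (w2, _) in zip(words, words[1:])]
    tags = list(dict.fromkeys(t for _, t in words))  # distinct, first-seen order
    return bigrams + tags

tagged = [
    ("scenic spot", "n"), ("service", "n"), ("really", "ad"), ("good", "a"),
    (",", "wd"), ("I", "rr"), ("very", "d"), ("like", "vi"), ("!", "wd"),
]
features = syntactic_features(tagged)
```

With Jieba installed, the `(word, tag)` pairs could be produced from `jieba.posseg.cut`, whose results expose `.word` and `.flag`; the tag names it returns may differ from those in the example.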
In practical use, to avoid the problems caused by an excessive feature dimension, the information gain (IG) feature selection method is adopted for feature dimension reduction. The formula is as follows:
$$IG(t) = -\sum_{i} P(C_i)\log P(C_i) + P(t)\sum_{i} P(C_i \mid t)\log P(C_i \mid t) + P(\bar{t})\sum_{i} P(C_i \mid \bar{t})\log P(C_i \mid \bar{t}), \tag{15}$$
where $P(C_i)$ is the probability of category $C_i$, $P(t)$ is the probability that feature t appears, $P(\bar{t})$ is the probability that feature t does not appear, $P(C_i \mid t)$ is the probability of category $C_i$ when feature t appears, and $P(C_i \mid \bar{t})$ is the probability of category $C_i$ when feature t does not appear. The score of each feature is calculated according to the formula, and the top N features by score are selected, which both selects features and reduces their dimension.
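Information gain can be computed as the class entropy minus the entropy that remains once the presence or absence of a feature is known. The sketch below scores each term this way and keeps the top N; the base-2 logarithm, the set-of-tokens document representation, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """IG(t) = H(C) - P(t) H(C | t) - P(not t) H(C | not t)."""
    with_t = [y for d, y in zip(docs, labels) if term in d]
    without_t = [y for d, y in zip(docs, labels) if term not in d]
    ig = entropy(labels)
    for subset in (with_t, without_t):
        if subset:                          # skip empty partitions
            ig -= len(subset) / len(labels) * entropy(subset)
    return ig

def top_n_features(docs, labels, n):
    """Rank every term in the vocabulary by IG and keep the n best."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab, key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:n]

# Toy corpus: each document is a set of tokens.
docs = [{"good", "service"}, {"good", "price"},
        {"bad", "service"}, {"bad", "price"}]
labels = ["pos", "pos", "neg", "neg"]

ig_good = information_gain(docs, labels, "good")
ig_service = information_gain(docs, labels, "service")
selected = top_n_features(docs, labels, 2)
```

In this toy corpus "good" and "bad" separate the classes perfectly and receive the maximum score, while "service" and "price" appear equally in both classes and score zero, so they are the ones discarded.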

Parallel Optimization of Emotion Classifier Based on Spark Platform.
The master node provides the initialization parameters $\theta = \{W, b, c\}$ for training and distributes them to each worker node. Each worker node uses the training data in its splits for parameter learning, with the minibatch as the unit for parameter updates. When a worker node finishes training on a batch, the resulting parameter change $\Delta\theta$ is sent to the master node for the parameter update, until all training is completed. The feature data processed in each training round are converted into RDD form for storage. The specific algorithm is shown in Algorithm 1.
The parallelization structure of the DBN network based on the Spark platform is shown in Figure 4.
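The master/worker scheme described above can be simulated in plain Python to show the data flow: broadcast the parameters, train locally on each split, then average on the master. This is a simulation under assumptions, not Spark code and not the training routine of this study. The `local_update` function is a hypothetical stand-in (gradient steps on a toy squared-error objective) for RBM training on a split, and the lists stand in for RDD partitions. On an actual cluster, broadcast variables and `mapPartitions` would presumably play these roles, although the text does not say which primitives were used.

```python
import random

def local_update(theta, split, lr):
    """Hypothetical worker step: one pass over the split, moving theta
    toward each sample (stands in for RBM training on that split)."""
    theta = list(theta)
    for x in split:
        for d in range(len(theta)):
            theta[d] -= lr * (theta[d] - x[d])
    return theta

def train_parallel(splits, theta0, rounds, lr=0.1):
    """Simulate broadcast, local training, and averaging on the master."""
    theta = list(theta0)
    n = len(splits)
    for _ in range(rounds):
        # Master broadcasts theta; each worker trains on its own split.
        worker_thetas = [local_update(theta, split, lr) for split in splits]
        # Master averages the workers' parameters to obtain the next theta.
        theta = [sum(w[d] for w in worker_thetas) / n
                 for d in range(len(theta))]
    return theta

rng = random.Random(0)
data = [[rng.gauss(2.0, 0.1), rng.gauss(-1.0, 0.1)] for _ in range(90)]
splits = [data[i::9] for i in range(9)]     # nine splits, one per worker
theta = train_parallel(splits, [0.0, 0.0], rounds=50)
```

Averaging full parameter vectors, as here, is equivalent to applying the average of the workers' changes $\Delta\theta$ to the broadcast parameters, since every worker starts the round from the same $\theta$.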

Experimental Data and Evaluation Indices.
The dataset of this experiment comes from COAE2015 Task 3. There are 133201 microblog sentences, including a large number of interfering sentences. The datasets are divided into four different areas for evaluation: books (BOO), audio products (DVD), electronic products (ELE), and kitchenware (Kit). Each dataset contains 2000 positive and 2000 negative comments.
In this study, accuracy is used as the evaluation index of the experiment, and the calculation formula is as follows:
$$\mathrm{Accuracy} = \frac{\mathrm{Num(correct)}}{\mathrm{Num(all)}}, \tag{16}$$
where Num(correct) is the number of samples correctly predicted by emotion classification and Num(all) is the total number of samples in the test corpus.

Relationship between Iteration Times and Prediction Accuracy.
The advantage of a deep neural network over a shallow one is that it can iteratively learn, extract features, and constantly modify the model, but too many or too few iterations will affect the overall performance. If the number of iterations is below a certain value, features are learned incompletely and the model does not reach its full performance. If the number of iterations is above a certain value, training takes too long and is inefficient. Therefore, the choice of the number of iterations is very important. In the experiment, with Ft1 as the feature, the relationship between prediction accuracy and the number of iterations is shown in Figure 5. It can be seen from the figure that when the number of iterations is less than 60, the recognition rate increases significantly as the number of iterations increases. When the number of iterations reaches 65, the accuracy changes little and is almost stable. Based on this analysis, 65 iterations are selected to ensure the stability of the results.
Table 1: Brief description of the emotional rules.
Expression emotion rule: When an expression word is detected, the corresponding score is given directly according to the expression dictionary.
Emotional rules of turning conjunctions: When the preinflection word is detected, it is weakened according to the strength value of the conjunction dictionary.
Borrowed emotion rules: Used to strengthen/weaken demonstrative pronouns.
Table 3 lists the experimental results of the text emotion classification method based on the deep belief network designed in this study. The input of the network is the vector composed of the 1000-, 2000-, or 4000-dimensional features with the highest information gain. Abstract text features are learned through the nonlinear mapping of the hidden layers. The specific settings are as follows: for the 1000-dimensional feature set, the restricted Boltzmann machine is trained for 100 iterations, and the node parameters corresponding to the network structure "input layer-hidden layer-output layer" are "1000-300-100." For the 2000-dimensional feature set, the restricted Boltzmann machine is trained for 100 iterations, and the node parameters corresponding to the network structure are "2000-600-300." For the 4000-dimensional feature set, the restricted Boltzmann machine is trained for 100 iterations per layer, and the node parameters corresponding to the network structure are "4000-600-300." It can be seen from Table 3 that the method based on the deep belief network achieves the best classification accuracy of 90.94% when the structure is 2000-600-300 and the four features are combined.

Experimental Results and Analysis of Emotion Classification under Different Methods.
To verify the learning and representation ability of the proposed method, the same features are used to compare it with the methods in references [27] and [28]. The recognition rates of references [27] and [28] are 87.11% and 87.69%, respectively. With the 2000-600-300 structure and the four features combined, the proposed method achieves its best classification accuracy of 90.94%. As listed in Table 4, the overall accuracy of the proposed method is higher than that of both comparison methods, because it acquires more emotional knowledge than they do during feature learning and therefore performs better.

Microblog Emotion Analysis Results under Spark Platform.
The DBN network is optimized in parallel on the Spark platform. The Spark cluster used in the experiment is composed of 10 servers. One server is used as the management node of the Spark cluster, and the other nine servers are used as its computing nodes. Each server has a Xeon E5520 CPU, 20 GB of memory, and a 1 TB hard disk. In Figure 6, the abscissa represents the size of the training data and the ordinate represents the training time. It can be seen from the figure that when the amount of data increases to 60000, the Spark training time is only 27.78% of the single-machine training time. The Jieba word segmentation method is used to reduce the interference of irregular grammar and nonstandard words on the emotion analysis task for microblog text. Feature dimension reduction is carried out with the information gain feature selection method to avoid the curse of dimensionality. It can be concluded that the parallel DBN algorithm based on the Spark platform effectively improves operating efficiency when processing massive data.
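The reported time ratio can be restated as a speedup. The worked arithmetic below assumes that all nine computing nodes take part in training; the parallel efficiency $E$ is a derived quantity that is not reported in the text.

```latex
S = \frac{T_{\text{single}}}{T_{\text{Spark}}} = \frac{1}{0.2778} \approx 3.60,
\qquad
E = \frac{S}{9} \approx 0.40 .
```

Under that assumption, training on 60000 samples runs about 3.6 times faster on the cluster than on a single machine.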

Conclusion
To address the problem that existing methods in the big data environment do not sufficiently extract the emotional features of microblogs and that the average accuracy of their results is low, a microblog emotion analysis method using deep learning in the Spark big data environment is proposed. The DBN is parallelized on a Spark cluster, which greatly shortens the training time. Experimental results show that the proposed algorithm has good microblog emotion analysis ability.
In this study, the factors considered in the data-parallel fragmentation strategy are not comprehensive enough, and more data fragmentation strategies should be tried in the future. In follow-up work, other parallel optimization algorithms can be drawn on to improve the parallel speedup ratio of the algorithm. Moreover, in addition to word vector representation, researchers have developed new representation methods in recent years, such as Atlas and tree database, to represent text information. Therefore, the text emotion classification algorithm proposed in this study can be further improved. How to embed more, and more effective, semantic information from the text remains the focus of the next step.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.
ALGORITHM 1: Spark parallelized DBN network.
Input: Training data set S, where S is the set of feature vectors after microblog preprocessing
Output: Emotion classification result set
Determine the number of iterations K and the parameter $\theta_0$ for initializing the RBM
For i = 0 to K do
    The Master node broadcasts $\theta_i$ to each Worker node;
    The Worker node uses the data on its Split to train the parameters of the RBM network;
    All Worker nodes send $\Delta\theta$ to the Master node;
    The Master node calculates $\theta_{i+1} = \frac{1}{n}\sum_{j=1}^{n} \theta_i^{j}$.
    The feedback mechanism of the BP network is used to adjust and fine-tune the DBN network model.
End