A Deep Neural Network Model for the Detection and Classification of Emotions from Textual Content

Emotion-based sentiment analysis has recently received a lot of interest, with an emphasis on the automated identification of user behavior, such as emotional expressions, in online social media texts. However, most prior attempts rely on traditional procedures that are insufficient to provide promising outcomes. In this study, we categorize emotional sentiments by recognizing them in text. For that purpose, we present a deep learning model, the bidirectional long short-term memory (Bi-LSTM) network.


Introduction
Computer-based emotion detection and classification has long been a hot topic of research. Emotions can be detected from a variety of media, including text, photographs, video, and audio [1,2]. In recent years, social media sites such as Twitter, Facebook, and Instagram have undergone unexpected global expansion; Twitter, for example, had over 200 million monthly active users by the fourth quarter of 2021 [3]. Advances in computational linguistics and text analytics allow researchers to extract and evaluate the textual emotions that users express on social media via big data sources, provided they can address the particular obstacles presented by such content.
Computational linguistics experts have performed many studies to detect and identify emotions at various levels, including the word, expression, and sentence levels [4-7]. Many studies, on the other hand, focus on emotion-bearing terms, with little attention paid to textual clues to emotions, which, if included, may improve the output of cognitive-based sentiment classification for social media data. Hence, the research and development of text-based emotion classification systems are motivated by studies of emotions expressed in text. The proposed emotion-based sentiment analysis system for social media builds on previous work on emotion classification in the social media paradigm [5,8]. Previous studies lacked the capability to represent emotion signals using advanced feature representations and did not exploit the capability of individual deep learning models for emotion classes. In contrast, we present a social media-based cognitive sentiment analysis framework that focuses on detecting and classifying textual emotions using an advanced feature representation scheme and a fusion of deep learning models.

Research Motivation.
Over the last few years, emotion-based SA applications have become increasingly popular on the Internet for gauging the opinions, emotions, and sentiments of individuals on different issues and policies. However, it is often difficult to analyze text with existing emotion detection methods to detect emotions from social media content. Therefore, it is essential to extract and analyze social media content to automatically classify emotions. The difference between social media and traditional blogs is that the former incorporates complex textual material. Compared with text information alone, such material contains both text and emotion cues, which are more suitable for expressing and conveying people's subtle thoughts, emotions, and personal characteristics [1]. However, emotion-based sentiment analysis, which is based on the identification of emotional cues, is still in its early stages.
There has been extensive work carried out in the areas of text-based sentiment analysis [8], lexicon construction [4], cognition and aspect-based sentiment analysis [6], and visual sentiment analysis [1,9]. However, further research is required in the area of cognitive-based social media analysis, with a focus on extracting and categorizing emotions from social media content.

Problem Statement.
In this work, we address the problem of emotion classification of English text using a deep learning technique. Given a set of reviews R = {r1, r2, r3, . . ., rn} as input, the aim is to develop a classifier that assigns an emotion label Ei ∈ {J-S, F-G, F-S} to each review ri, where J-S, F-G, and F-S denote the different emotion label pairs. Our goal in this study is to develop a powerful deep learning-based emotion recognition system capable of accurately classifying the provided text reviews into the required emotional category, thereby contributing to the scientific literature on detecting emotional indicators in textual material on social networking sites. The proposed system will help companies identify and analyze their consumers' attitudes and emotions toward a program, policy, or product, which can help them make better decisions. The rest of the article is organized as follows: (i) Section 2 discusses the current methods and frameworks related to emotion-based sentiment analysis, (ii) Section 3 presents the proposed framework for emotion-based sentiment analysis, (iii) Section 4 introduces the experimental setup, and (iv) Sections 5 and 6 conclude the proposed system with its drawbacks and potential future guidance.

Related Work
Due to the ambiguity of emotions and the large number of emotional terms, detecting and classifying emotions in sentiment analysis is a challenging activity. Several techniques and methods have been proposed for detecting and classifying emotions from social media textual content. This section gives a summary of the current state of emotion detection and classification research. The efficacy of different machine learning models is evaluated in [8]. Variant machine learning models such as SVM, NB, DT, LR, XGBoost, KNN, and the backpropagation neural (BPN) classifier are tested using the state-of-the-art ISEAR emotion dataset. According to the results, BPN had the highest accuracy of all the classifiers, with a score of 71.27 percent. The work has a few limitations: (i) it is limited to five emotion classes, (ii) it uses only the ISEAR dataset, (iii) conventional features are used, and (iv) classical machine learning models are used. Future directions may include (i) utilizing various emotion combinations, (ii) utilizing other benchmark datasets, (iii) conducting additional deep learning experiments, and (iv) utilizing a word embedding feature representation approach. The authors of [10] address the task of categorizing emotions in Indonesian texts. The study used a variety of machine learning algorithms, including NB, SVM, KNN, and minimal optimization. Various text cleaning tasks (tokenization, stop word deletion, and case conversion) are also introduced. During the tests, 10-fold cross-validation was used, and the findings show that the minimal optimization technique outperformed the other techniques. For the extraction of variant tweets, emotion-word hashtags as well as the "Hashtag Emotion Corpus" data collection are used in [11]. Furthermore, that study used an emotion-labeled tweet dataset to create an emotion dictionary of rich terms.
Experiments show that the output of the SVM model is most closely linked to the primary emotion classes. The inclusion of emotion words with variant synonyms, on the other hand, can aid in improving the system's effectiveness. The identification of emotions in human speech was performed by [12]. Different speech recognition models are used, followed by various speech features (e.g., peak-to-peak distance). The dataset used in the experiments contains 30 distinct subjects, and the method achieved the highest precision in comparison to alternative methods. An et al. [13] tackled the issue of music emotion classification based on lyrics. They accomplish this by crawling music lyrics with their tags on Baidu, a well-known network platform. Following that, a naive Bayes classifier trained on four different datasets is used. According to the findings, the final classification accuracy on dataset D-4 was 68 percent. Using other algorithms, on the other hand, could improve the proposed system's effectiveness. Dini and Bittar [14] conducted research to classify emotions on Twitter. Two corpora were created for this purpose: ETCC for emotion classification of tweets and ETCR for emotion relevance of tweets. This dataset is used to train and evaluate machine learning models, as well as to test a rule-based model. The results show that the symbolic technique outperformed the ML algorithm in determining tweet relevance, but the ML algorithm is the better option for tweet emotion classification. The goal in the future is to create a hybrid model that combines both approaches, and labeled data quality testing will be discussed as well. Kaewyong et al. [15] suggested a lexicon-based approach for the automated analysis of student feedback. First, data was gathered from over 1100 student responses about teaching faculty.
Following that, preprocessing techniques are applied, and a sentiment lexicon is used to assign sentiment scores to opinion terms. The findings show that the proposed solution has the best efficiency in comparison with the other approaches. Sen et al. [16] set out to develop a novel method for multitask learning of input word embeddings by using supervised signals for emotion and sentiment tasks. After that, they examined using jointly trained embeddings to improve emotion detection. With a test accuracy of 57.46 percent, the findings show that KMeans (ES-SWE) outperformed the other approaches. In the future, they expect to perform a configuration experiment in which word embeddings are jointly trained, sensitive to emotion topics, and used for downstream emotion or topic recognition. Furthermore, their proposed approach can be applied to other classification problems besides emotion and sentiment identification, and the work can also be extended to n-gram embeddings. Kollias et al. [17] investigated deep convolutional neural networks (DCNNs) for facial expression and emotion detection. The experimental findings show that the proposed model outperforms the competing models. However, accurate estimation in practice can be accomplished by developing a real-world application for human-computer communication. Poria et al. [18] use a convolutional learning approach to tackle the problem of extracting emotions from media content, specifically text, audio, and video. The inner layer of the network used an activation feature and achieved an accuracy of 96.55 percent for the MOUD dataset and 76.85 percent for the IEMOCAP dataset, in comparison to the baseline work. To conduct microblog sentiment analysis, Severyn and Moschitti [19] proposed a deep learning-based convolutional neural network (CNN). The input to their proposed model is seed words that have been properly trained using a deep learning model.
The key advantage of the framework is that it does not need support attributes to train the model on Twitter data records. The proposed model achieved the best results at the sentence and phrase levels, according to the experimental results. Gupta et al. [20] present a new deep learning approach to emotion identification in textual conversations, namely, sentiment- and semantic-based LSTM (SS-LSTM). Two datasets, ISEAR and SemEval-2007 Affective Text, were used in the experiments.
The experimental results show that the proposed method outperformed state-of-the-art machine learning as well as other deep learning approaches, with an average F1-score of 71.34 percent. However, the technique could be improved by using a context-sensitive framework to train the model. Cambria [21] uses affective computing methods to investigate the problem of sentiment identification and emotion prediction. Text classification into positive and negative senses is carried out in one module, and emotional clues are identified in the emotion prediction module. In the proposed approach, feelings are found first, and then a particular type of emotion is assigned to the detected feeling. As the literature review above shows, a number of research works have sought to detect and classify emotions from textual material.

Research Gap.
To resolve the limitations of the aforementioned emotion-based sentiment classification techniques, it is necessary to detect and identify emotions in textual material. To fill this gap, we propose a comprehensive emotion-based sentiment system for classifying emotions expressed in online social media. For emotion classification, we use automated feature engineering techniques followed by a deep neural network called Bi-LSTM.

Proposed Methodology
This section presents the comprehensive architecture of the proposed method for emotion classification.

Acquiring Data.
We have obtained a benchmark dataset called "ISEAR" [8] containing 5474 records. Each input review is tagged with one of three emotion class sets, namely J-S, F-S, and F-G. This work employs the Python language [1], and the library used is Keras, which is based on the TensorFlow deep learning framework [1]. Figure 1 provides a description of the dataset.
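As a minimal illustration (the records and labels below are invented stand-ins, not actual ISEAR entries), the labeled reviews can be represented in Python and grouped into the binary emotion pairs as follows:

```python
from collections import Counter

# Toy stand-in records in the spirit of the ISEAR data: (review_text, label).
records = [
    ("When I got a letter offering me the summer job", "joy"),
    ("When I lost the person who meant the most to me", "sadness"),
    ("When a man came up behind me late one night", "fear"),
    ("When I did not speak the truth", "guilt"),
    ("When my friends did not ask me to the party", "sadness"),
]

# Class distribution, as one would inspect before building the J-S,
# F-G, and F-S binary pair tasks
counts = Counter(label for _, label in records)

# Subset for one binary pair task, e.g., Joy vs. Sadness (J-S)
js_subset = [(t, l) for t, l in records if l in {"joy", "sadness"}]
print(counts["sadness"], len(js_subset))  # 2 3
```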

Training Set.
The model has been trained using the training dataset, with 80 percent of the data used for model training [1]. Figure 2 shows exemplary training examples.

Validation Set.
In the training phase, the model is typically accurate, but in the testing phase, its performance declines. Therefore, to overcome the model's performance errors in terms of underfitting and overfitting, a validation set must be used [22]. Keras has two methods for determining the optimum model parameters: manual data validation and automatic data validation [3]. We employ manual data validation in this project.
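A hedged sketch of such a manual split; the 80 percent training share is from the paper, while the even division of the held-out data between validation and test is our assumption here:

```python
import random

random.seed(7)                                  # reproducible shuffle
reviews = [f"review_{i}" for i in range(100)]   # stand-in for labeled reviews
random.shuffle(reviews)

# 80% training, as in the paper; the remaining 20% is split evenly
# between validation and test (an assumption for illustration).
n = len(reviews)
train = reviews[: int(0.8 * n)]
val = reviews[int(0.8 * n): int(0.9 * n)]
test = reviews[int(0.9 * n):]
print(len(train), len(val), len(test))  # 80 10 10
```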

Testing Set.
Based on new/unseen cases, the testing set is used to evaluate the performance of the model. It is used once the model has been trained properly using both the training and validation sets. The test set is responsible for the model's ultimate prediction [23]. Table 1 contains test set samples for emotion classification.

Main Modules of the Proposed System.
The proposed method consists of three main modules: (i) Embedding Layer-based Word Representation, (ii) Bi-LSTM-based Forward and Backward Context Information Saving, and (iii) Sigmoid Layer-based Classification. The first module's goal is to obtain a numeric representation of the terms, which is fed into the second module, which produces an encoded representation of features. Bi-LSTM is used to create this encoded representation, which keeps track of both the forward and backward contextual details of each word within a sequence. Finally, a sigmoid activation mechanism performs classification in the final module (see Figure 3). The following is a breakdown of each module:

Words Representation Exploiting Embedding Layer.
The emotion dataset is represented as a set of user reviews. An individual review E, e.g., "I felt very happy when I won the soccer pool," involves a series of r words w1, w2, w3, . . ., wr. A single term wi, e.g., "felt," is represented by an embedding vector wi ∈ Rn containing real values, e.g., [0.6, 0.9, 0.2]. The embedding vectors of all terms make up an embedding matrix. In this work, the Keras embedding layer was used. The embedding matrix is a two-dimensional matrix D ∈ R(r×n), where r denotes the length of the input review and n denotes the embedding dimension. The embedding matrix D, also known as the sentence/input matrix, is then passed to the next layer.
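The lookup described above can be sketched as follows; the embedding values are invented for illustration, apart from the example vector for "felt" quoted in the text:

```python
# Illustrative embedding table (values invented, except "felt", which
# reuses the paper's example vector); embedding dimension n = 3.
embeddings = {
    "i":     [0.1, 0.4, 0.3],
    "felt":  [0.6, 0.9, 0.2],
    "very":  [0.2, 0.1, 0.5],
    "happy": [0.8, 0.7, 0.1],
}

review = ["i", "felt", "very", "happy"]
# Sentence/input matrix D: one row per token, so D has shape (r, n) = (4, 3)
D = [embeddings[w] for w in review]
print(len(D), len(D[0]))  # 4 3
```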

Bi-LSTM Layer.
The Bi-LSTM layer is responsible for learning long-range dependencies. It saves both the preceding and succeeding context in the form of an encoded user review. A unidirectional LSTM, in contrast, saves only the information from the previous context, leaving out the information from the subsequent context. As a result, Bi-LSTM gathers more information for the encoded review. To learn past and future contextual knowledge of tokens (words), Bi-LSTM employs a forward and a backward LSTM [2]. Forward LSTM. The forward LSTM processes the sequence from left to right through the concatenation of two inputs: the current input "xt" and the prior hidden state "ht−1." For a provided input sequence x1, x2, x3, . . ., xr, the forward LSTM generates the output "h→." Backward LSTM. The backward LSTM processes the sequence from right to left through the concatenation of two inputs: the current input "xt" and the future hidden state "ht+1." For the reversed input sequence xr, . . ., x3, x2, x1, the backward LSTM generates the output "h←." The forward and backward outputs are integrated through the computation of an element-wise sum (see (1)).
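The direction handling and the element-wise sum merge can be sketched as follows; a minimal tanh cell stands in for the full LSTM cell here, and all sizes and weights are toy values:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 4                          # input size and hidden size (toy values)
W = rng.normal(size=(m, n)) * 0.1    # input-to-hidden weights
U = rng.normal(size=(m, m)) * 0.1    # hidden-to-hidden weights

def cell(x_t, h_prev):
    """Minimal tanh recurrent cell, a stand-in for the full LSTM cell."""
    return np.tanh(W @ x_t + U @ h_prev)

def run(sequence):
    """Run the cell over a sequence, returning the output at every step."""
    h, outputs = np.zeros(m), []
    for x_t in sequence:
        h = cell(x_t, h)
        outputs.append(h)
    return outputs

seq = [rng.normal(size=n) for _ in range(5)]
forward = run(seq)                       # left-to-right pass
backward = run(seq[::-1])[::-1]          # right-to-left pass, re-aligned
merged = [f + b for f, b in zip(forward, backward)]  # element-wise sum merge
print(merged[0].shape)  # (4,)
```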

Table 1: Sample review texts.
(1) When I was involved in a traffic accident.
(2) When I lost the person who meant the most to me.
(3) When I did not speak the truth.
(4) When I got a letter offering me the summer job that I had applied for.
(5) When I was going home alone one night in Paris and a man came up behind me and asked me if I was not afraid to be out alone so late at night.
(6) When my friends did not ask me to go to a New Year's party with them.
Lastly, the new representation of the sentence matrix (1) is fed into a classification layer for the final classification. The equations (2)-(7) used for the forward LSTM [24] are the standard gate equations:

f_t = σ(W_f x_t + U_f h_{t−1} + b_f),
i_t = σ(W_i x_t + U_i h_{t−1} + b_i),
o_t = σ(W_o x_t + U_o h_{t−1} + b_o),
c̃_t = tanh(W_c x_t + U_c h_{t−1} + b_c),
c_t = f_t ⊙ c_{t−1} + i_t ⊙ c̃_t,
h_t = o_t ⊙ tanh(c_t).

The equations (8)-(13) used for the backward LSTM [24] are identical in form, with the prior hidden state h_{t−1} replaced by the future hidden state h_{t+1}. Here, n represents the size of the input and m represents the size of the cell state. Every gate in the Bi-LSTM performs its own function: the forget gate f_t deletes useless information, the input gate i_t decides which information to store, and the output gate o_t calculates the final output h_t. The notations used during forward and backward LSTM are listed in Table 2.
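A minimal NumPy sketch of one forward-LSTM step implementing the standard gate equations (toy sizes, random weights; not the trained model):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, P):
    """One forward-LSTM step with the standard gates:
    forget gate f_t, input gate i_t, output gate o_t, cell state c_t."""
    f = sigmoid(P["Wf"] @ x_t + P["Uf"] @ h_prev + P["bf"])  # drop stale info
    i = sigmoid(P["Wi"] @ x_t + P["Ui"] @ h_prev + P["bi"])  # admit new info
    o = sigmoid(P["Wo"] @ x_t + P["Uo"] @ h_prev + P["bo"])  # expose state
    c_tilde = np.tanh(P["Wc"] @ x_t + P["Uc"] @ h_prev + P["bc"])
    c = f * c_prev + i * c_tilde      # updated cell state
    h = o * np.tanh(c)                # new hidden state
    return h, c

rng = np.random.default_rng(1)
n, m = 3, 4                           # input size n, cell-state size m
P = {f"W{g}": rng.normal(size=(m, n)) * 0.1 for g in "fioc"}
P.update({f"U{g}": rng.normal(size=(m, m)) * 0.1 for g in "fioc"})
P.update({f"b{g}": np.zeros(m) for g in "fioc"})

h, c = np.zeros(m), np.zeros(m)
for x_t in [rng.normal(size=n) for _ in range(5)]:
    h, c = lstm_step(x_t, h, c, P)
print(h.shape)  # (4,)
```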

Feature Classification Using Sigmoid Layer.
This layer performs the classification of the input features (final representation) obtained from the previous module. For this purpose, we add a dense layer with two neurons and a sigmoid function. The sigmoid activation function performs a nonlinear operation; its task is to calculate the probability of the emotion classes by converting the weighted sum into a number between 0 and 1. Therefore, after passing through the output layer, the review text "I felt very happy when I won the football pool" is tagged with one of the three binary classes "J-S," "F-S," or "F-G." The probability of each emotion class is calculated using a softmax activation function. The net input for classifying the final emotion representation (equation (14)) is computed as z = w · x + b, where "w" denotes a weight vector, "x" denotes a vector of inputs, and "b" denotes a bias term. The phases of the Bi-LSTM system for emotion categorization are depicted in Algorithm 1.
The evaluation results are reported in Table 3. According to the experimental findings, the BILSTM model performed better for the emotion clue "joy," with an F1-score of 0.89 and a recall of 0.91, while both emotion clues "joy" and "sadness" performed best in terms of precision (0.88). The overall precision is 0.88. Table 5 shows that the BILSTM model performed well for both the "Fear" and "Shame" emotion clues, with precision (0.89), recall (0.89), and F1-score (0.89). The overall precision is 0.89.
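A small numeric illustration of the net input and sigmoid activation described above; the weights, inputs, and bias are invented values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Net input z = w . x + b for the final emotion representation
# (the weight vector, input vector, and bias below are illustrative only)
w = [0.4, -0.2, 0.7]
x = [0.5, 0.1, 0.9]
b = 0.1
z = sum(wi * xi for wi, xi in zip(w, x)) + b   # z = 0.91
p = sigmoid(z)                                 # squashed into (0, 1)
print(round(p, 3))  # 0.713
```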

Answer to RQ2.
To find the answer to RQ2, "What is the efficiency of the proposed technique in relation to other machine learning and deep learning techniques?", we conducted experiments to compare the efficiency of a word embedding scheme trained using the BILSTM method with machine learning classifiers that use traditional feature representation schemes such as TF-IDF and Count-Vectorizer. The assessment results are presented in Tables 6 and 7.

BILSTM and Machine Learning Approaches Exploiting Traditional Features: A Comparative Study.
The output of the proposed BILSTM model is compared with that of various machine learning approaches using classical feature representation schemes, as shown by the following experiments. ML Driven. We estimate the output of different classical feature representation strategies for machine learning classifiers, such as Countvectorizer and TF-IDF, in these experiments. Countvectorizer uses a count-of-words method, while the TF-IDF scheme converts the text into weighted feature vectors. Several classifiers are employed with the Countvectorizer, TF, and TF×IDF feature encoding techniques, with KNN achieving the best accuracy score of 80.82 percent and XGBoost the worst accuracy score of 71.23 percent. Proposed BILSTM Model. We used the BILSTM model over a word embedding feature representation scheme to perform the emotion classification task in this experiment. The main advantage of word embedding over the traditional BOW scheme is that the BOW model's output degrades as the vocabulary grows, while deep neural network models yield better results. Table 8 shows that the BILSTM (proposed method) exploiting a word embedding scheme generates better findings than traditional feature representation methods such as Countvectorizer and TF-IDF. The results in Tables 8 and 9 are summarized as follows.
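The two traditional feature schemes can be illustrated in a few lines; the TF-IDF weighting below is the common textbook variant tf × log(N/df), which may differ from the exact smoothing a given library applies:

```python
import math
from collections import Counter

# Toy corpus standing in for the review texts
docs = [
    "i felt very happy today",
    "i felt sad and alone",
    "happy happy joy",
]

# Count-of-words (Countvectorizer-style) term frequencies per document
counts = [Counter(d.split()) for d in docs]

# Document frequency of each term across the corpus
N = len(docs)
df = Counter(t for c in counts for t in c)

def tfidf(term, doc_counts):
    """Textbook TF-IDF: tf(t, d) * log(N / df(t))."""
    return doc_counts[term] * math.log(N / df[term])

print(counts[2]["happy"])  # 2
```

Rarer terms receive larger weights, e.g., tfidf("sad", counts[1]) exceeds tfidf("i", counts[1]) because "sad" occurs in only one document.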

Comparison of Proposed BILSTM with Variants of Deep Learning Models.
(2) Comparing Proposed BILSTM with Individual CNN. In the initial experiment for the emotion classification task, the proposed BILSTM model is compared to an individual CNN. In terms of accuracy, precision, recall, and F1-score, the proposed model outperformed the individual CNN model (accuracy = 88 percent, precision = 88 percent, recall = 88 percent, and F1-score = 88 percent). The decrease in CNN performance is due to the fact that the text classification task requires the preservation of sequential information, which a single CNN cannot provide. In addition, a large dataset must be given to the CNN model in order for it to improve its accuracy.
(3) Comparing Proposed BILSTM with Individual LSTM. In the second experiment, an individual LSTM is compared to the proposed model on the emotion classification task. The downside of the unidirectional LSTM is that it only retains previous information, not subsequent information. Maintaining information on both sides of a word (previous and subsequent) yields a greater understanding of sentence context. As a result, when compared to the proposed deep neural network model, an individual LSTM layer reduces performance by up to 10% for J-S, 2% for F-G, and 1% for F-S, as shown in Table 8.
(4) Comparing Proposed BILSTM with Individual GRU. For the classification of emotions, a performance comparison between the proposed BILSTM model and an individual GRU is performed in the third experiment. The main disadvantage of the GRU is that it does not perform feature extraction and instead focuses on preserving contextual details. In comparison with the proposed BILSTM model, the GRU's output dropped by up to 6% for J-S, 3% for F-G, and 4% for F-S, as shown in Table 8. When we ran a McNemar-type test, the chi-squared statistic was 5.3 with 1 degree of freedom, which corresponds to a two-tailed p value of 0.021. Consequently, the null hypothesis is rejected and the alternative hypothesis is adopted.
(5) Comparing Proposed BILSTM with Individual RNN. Compared with the proposed model, the RNN's performance degraded because it is unable to handle longer sequences, demonstrating the RNN's inability to capture long-range relationships. It is necessary to retain details for a long time in order to maintain context, which an individual RNN cannot do because it only keeps track of short-term memory sequences. In comparison to the BILSTM, the RNN showed a performance decline of up to 22% for J-S, 11% for F-G, and 6% for F-S, as shown in Table 8.

Why the Proposed BILSTM Model Is Better?
As we employed a "bidirectional" LSTM (BILSTM) on the supplied datasets, the findings outperform other deep learning approaches. The purpose of the BILSTM model is to retain context from both sides of an expression, i.e., the left and right sides, within a phrase. After receiving data from the embedding layer, the BILSTM generates an enhanced encoding of the data that takes into consideration both the current and earlier input information. Consequently, the BILSTM deep learning model can successfully collect current and past contextual information through time and produce predictions.
The suggested deep learning model performed well on the dataset in categorizing the input text into various emotions such as Joy, Sadness, Fear, Guilt, and Shame.

Answer to RQ3.
To answer RQ3, "How does the efficiency of the proposed technique regarding emotion classification compare with baseline studies?", we ran an experiment to see how well the proposed BILSTM model performs in comparison to the baseline study. The experimental results are mentioned in Table 6.

Proposed (BILSTM) Compared with Baseline (BPN).
In this experiment, we compared the output of the proposed BILSTM approach with that of [8], who used a backpropagation neural classifier to classify emotions. With an accuracy of 87.66 percent, the proposed method outperformed the current state-of-the-art approach, as shown in Table 9 (evaluation results of the machine learning approaches and the proposed technique).
The reasons for our model's improved performance are as follows. We used a "bidirectional" LSTM model, which is effective at maintaining both the left and right contextual details of a sequence. Furthermore, the BILSTM is good at storing information over a long period of time, which is very useful for text classification and prediction tasks. BPN, on the other hand, is not well suited to such classification problems [1]. In addition, the BPN classifier has issues with delayed convergence, network weights settling into local optima, and network insensibility [28].

Significance Test.
We also ran experiments to see whether the proposed BILSTM classifier, which uses the word embedding function, is statistically different from KNN for J-S and from MNB for F-G and F-S, as a significance test for the emotion classification task. The findings suggest that the efficacy of our proposed BILSTM model in the emotion classification task for J-S, F-G, and F-S was greatly enhanced by the innovative features (word embedding), as seen in Tables 7, 10, and 11.
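For reference, McNemar's test with continuity correction can be computed directly from the discordant counts of a 2×2 contingency table. The counts below are those reported for the MNB comparison, and the sketch closely reproduces the reported figures (chi-square ≈ 2.21, p ≈ 0.137):

```python
import math

def mcnemar(b, c):
    """McNemar's test with continuity correction.
    b, c: the two discordant cell counts of the 2x2 contingency table."""
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Discordant counts read off the MNB contingency table reported in the paper
chi2, p = mcnemar(10, 19)
print(round(chi2, 2), round(p, 3))  # 2.21 0.137
```

Since p ≈ 0.137 exceeds the usual 0.05 threshold, this particular comparison does not reject the null hypothesis.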

Conclusion
We further analyzed why and how users react in different emotional states. To carry out the research task, we applied a Bi-LSTM technique. The study includes the following modules: (i) acquire the data, (ii) prepare the data, and (iii) apply the deep learning algorithm. For the emotion classification task, we used a deep learning model, namely Bi-LSTM, which performs two tasks at once; i.e., it can remember both the forward and backward context of the input sequence [8]. After encoding the text with several other programs, the decoded text is manually classified as J, F, and G. Experiments applying different machine learning and deep learning algorithms to the emotion datasets were also conducted.
The results show that the proposed Bi-LSTM model produced improved results in terms of accuracy (87.66%), precision (87.66%), recall (87.66%), and F1-score (87.66%) with respect to the compared studies. The following are some potential constraints on the proposed work: (1) we perform emotion classification of textual content only; (2) the research is limited to random word embeddings, with no use of pretrained word representation models such as GloVe, FastText, or word2vec; (3) emotions are not classified using other configurations of deep learning techniques; (4) the current study is solely focused on the ISEAR and Twitter emotion datasets; (5) limited emotion clues are exploited; (6) the current research does not consider emotion intensities such as strong negative, strong positive, weak negative, and weak positive, which must be addressed to make the system more effective; (7) the research is limited to texts written in English; (8) AUC, density, and error rate could be used to better estimate the classifier's performance, but the proposed study is confined to performance metrics such as accuracy, precision, recall, and F1-score; and (9) a modest number of machine learning models are currently used for testing, which can be extended.
For the MNB significance test, the contingency table reads as follows: MNB correct classification, 150 and 10 (total 160); MNB misclassification, 19 and 40 (total 59); column totals, 169 and 50 (overall 219). With one degree of freedom, we observed a chi-square statistic of 2.22 and a two-tailed p value of 0.137. Since this p value exceeds the 0.05 threshold, the null hypothesis cannot be rejected for this comparison.

Future Directions
The following is a list of probable future options for the research work: (1) the work may be expanded to photographs and videos; (2) other pretrained schemes such as GloVe, word2vec, and FastText can be used for the word embedding layer in future work.
(3) For emotion classification, we will investigate various combinations of deep neural networks. (4) Additional emotion classification tasks will be explored on different datasets. (5) One potential plan is to increase the performance of the research by extending the set of emotion clues. (6) The research will be expanded to other languages to test the efficacy of the proposed model. (7) The problem of long computation times can be addressed using GPUs, which allow large-scale dataset experiments. (8) Using a combination of deep neural network systems to solve the emotion classification problem would be more efficient; accordingly, we will use other neural networks in the future. (9) We will put more emphasis on ensemble approaches to improve system efficiency. (10) We will focus on adding more base models and looking for other criteria that might further improve the overall accuracy of the proposed work. (11) In future studies, we will look at combining our proposed approach with other NLP strategies, such as part-of-speech tagging, to achieve better results on NLP problems.

Data Availability
Two publicly available datasets (ISEAR and SemEval-2007) were used to support the findings of this study. The two datasets are cited within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.