Sentiment Prediction of Textual Data Using Hybrid ConvBidirectional-LSTM Model

With the emergence of social media platforms, most people have changed the way they interact. Sharing day-to-day lifestyle updates is a trend substantially influenced by microblogging sites such as Twitter, Facebook, and Instagram, and text messages are the preferred medium for such interactions. Twitter is one of the most commonly used microblogging tools; it enables people to express their thoughts, opinions, emotions, happiness, sadness, excitement, ideas, mental stress, and so on. Hence, sentiment prediction from such textual data becomes a complex and challenging task. In this research, the authors propose a hybridization of the convolutional neural network and the bidirectional long short-term memory model (named ConvBidirectional-LSTM), which aims to improve the categorization of sentiments in text data. The proposed hybrid ConvBidirectional-LSTM model is compared with existing state-of-the-art models, namely the GloVe-based CNN-LSTM and Hierarchical Bi-LSTM (HeBiLSTM) models. Furthermore, the performance of the proposed hybrid ConvBidirectional-LSTM model is evaluated on the US airline dataset using performance measures such as accuracy, precision, recall, and F1 score. The proposed model outperformed the existing state-of-the-art models with an accuracy of 93.25% in sentiment prediction.


Introduction
The tremendous growth of Internet utilization, particularly by microblogging websites, has resulted in the generation of a significant volume of textual information conveying people's choices, thoughts, and emotions. This textual information is significantly helpful and might be used by businesses, governments, and others to make choices, provided that evaluation tools can extract the necessary knowledge from it and categorize it depending on its polarity. This problem is explored in the area of sentiment classification, which is a branch of computational linguistics [1]. The microblogging movement emerged due to the fast evolution of information and communication technologies, which has increased the number of online media users around the globe. Web 2.0's major characteristics include collaborative information exchange among users on a digital site, resulting in massive volumes of unstructured material on various themes. As a result, microblogging sites became necessary for sharing connections and business interactions. The increasing use of social media platforms like Twitter, LinkedIn, and Facebook, and of consumer review media, has sparked interest in the microblogging era. With the rapid growth of social mass communication and textual content, sentiment classification employing customer reviews attracts enormous interest from various organizations (e.g., commercial and academic) [2]. A variety of research still applies data mining approaches to social networking text data; significant examples include connectivity, material, and customer data [3]. One kind of information that has become a substantial focus of recent studies is the detection of individuals' opinions in blog articles about a particular topic, referred to as sentiment prediction. In this research work, three sentiment labels, positive, negative, and neutral, are used. Twitter, a prominent social network site, provides users with a tool through which any individual can send and receive short text messages.
Several distinctive features of Twitter make it appealing to businesses, including its transparency, its restriction on the length of uploaded posts, and the widespread use of hashtags. Whereas most social media platforms require two members to be connected before they can see each other's posts, Twitter enables people to see each other's posts even if they do not know each other, which makes it simple to gather information [4].
1.1. Sentiment Analysis. Every tweet contains either a positive, negative, or neutral sentiment. The sentiment of the user can be determined using the sentiment score that is calculated based on the positive and negative words in a tweet, as shown in the following equation [5]:

$$\text{sentiment score} = \frac{P - N}{P + N + 2}. \quad (1)$$

Here, P and N denote the total counts of positive and negative words in a tweet, respectively. The sentiment score is represented using a discrete 2-valued variable S that represents the sentiment class. All the sentiment score values and the differences between them are captured by the variable S. In some cases, the polarity value fails to identify the degree of sentimentality of the textual data because the negative and positive contributions cancel each other, which results in a zero sentiment score (sentiment score = 0). Although the text of such a tweet is positive or negative rather than neutral, the zero sentiment score mislabels it. Hence, the following constraints are followed to identify positive and negative tweets.
When a tweet is provided as input to the system, its polarity is calculated to identify whether the tweet is positive, negative, or neutral.
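As a minimal sketch of equation (1), the snippet below counts positive and negative words and computes the score. The two tiny word lists and the function name `sentiment_score` are illustrative assumptions, not the lexicon or implementation used in [5].

```python
# Hypothetical, tiny word lists for illustration only (not the lexicon of [5]).
POSITIVE = {"good", "great", "love", "thanks"}
NEGATIVE = {"bad", "late", "lost", "worst"}


def sentiment_score(tweet: str) -> float:
    """Return (P - N) / (P + N + 2) for a whitespace-tokenized tweet."""
    words = tweet.lower().split()
    p = sum(w in POSITIVE for w in words)  # P: count of positive words
    n = sum(w in NEGATIVE for w in words)  # N: count of negative words
    return (p - n) / (p + n + 2)


print(sentiment_score("great crew but flight late"))  # P = 1, N = 1 -> 0.0
print(sentiment_score("love the great service"))      # P = 2, N = 0 -> 0.5
```

The first call illustrates the cancellation case: one positive and one negative word yield a score of zero even though the tweet is not neutral.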

Applications of Sentiment Classification. Some of the prominent applications of sentiment classification are as follows:
(i) Mainly used for classifying sentences, paragraphs, and documents into positive, negative, or neutral labels.
(ii) Used in commercial applications such as multimedia systems for writing movie reviews [6], news articles, restaurant reviews, mobile customer reviews [7], real-time insights, etc. [8].
(iii) Used to extract meaning from the sentence, classification of intent, and linguistics-based emotion analysis.
(iv) Used by product manufacturing companies for obtaining accurate product reviews based on customer ratings.
(v) Used in some of the fields where sentiment prediction is promptly needed, like Emotion Detection (E.D.), Building Resources (B.R.), Transfer Learning (T.L.), etc.
(vi) Used in creating artificial datasets by utilizing semisupervised machine learning algorithms.

1.2. Motivation. Sentiment prediction from textual data has gained vast significance. The prominent attributes procured from sentiment analysis can be used in decision-making, psychological processes, opinion collection for political promotion, product marketing, and so on. Due to enormous textual data generation, social media platforms have become prominent data sources for sentiment prediction-related works. Primarily, microblogging sites such as Twitter are widely used to collect people's opinions and views in "tweets" that have a maximum length of 140 characters. The anonymity of Twitter makes it easy for people to express their original sentiments on the microblogging site. Several approaches are being used to analyze text sentiments from Twitter data. Previous methods used for sentiment prediction are based on sentiment lexicons and depend on external resources or manual preprocessing for complex feature analysis. The limitations of the existing approaches motivated the authors to propose an enhanced deep neural network model that can extract different sentimental features from textual data and can accurately predict people's sentiments.

Research Objectives. The primary objective of this proposed work is to develop an effective sentiment prediction model based on textual data to analyze different sentiments. The proposed approach comprises a hybrid ConvBidirectional-LSTM model for extracting both word sequence and word semantic features for sentiment prediction.
The main objectives of the proposed research work are as follows. This research article aims to improve on the accuracy of the previous work carried out by different researchers [9][10][11][12][13]. The authors propose a hybrid ConvBidirectional-LSTM approach for analyzing different sentiments with the US airline dataset and predicting the polarity of the text sentiments. The ConvBidirectional-LSTM is the hybridization of the CNN and Bi-LSTM to learn complex contextual features and semantic information from the airline Twitter data. The experimental analysis relies on a real dataset collected from Twitter, which is used for predicting positive, negative, and neutral sentiments. To represent the Twitter posts as numerical arrays, a pretrained GloVe (https://github.com/stanfordnlp/GloVe, accessed on 26 February 2022) word embedding approach will be employed. This approach provides word matrices pretrained on unlabeled text that preserve word meaning and are learned from a massive set of words. This research will use a GloVe embedding vector approach to examine the quality of the proposed framework.
The results of the ConvBidirectional-LSTM model will be compared with the existing state-of-the-art models, namely the GloVe-based CNN-LSTM and HeBiLSTM models [14,15], to verify the efficacy of the proposed framework.
This research paper presents the related research background in Section 2, the proposed hybrid ConvBidirectional-LSTM for sentiment prediction in Section 3, the research experiment in Section 4, and the results and discussion in Section 5; finally, the research work is concluded in Section 6.

Background
This section reviews the related work and existing models used for sentiment analysis and also introduces word embedding, convolutional neural networks, and bidirectional-LSTM approaches.

Related Research. Sentiment classification, widely called opinion mining, draws the attention of customers using text mining and NLP approaches [16]; research has found that people potential, worker surveillance, and real-time insights [17] are all advantages of sentiment classification [18]. Network operators can use sentiment classification to find out which services they are missing and which areas their existing customers are happy with [19]. Sentiment classification operates by distinguishing positive and negative thoughts inside text evaluations, which can be quite difficult to recognize in subtle wordplay [20]. Table 1 lists several related sentiment classification research studies.
Basiri et al. [9] suggested an attention-based bidirectional CNN-RNN deep model (ABCDM) and evaluated it on eight different datasets: App, Movies, Kindle, US-airline, Electronics, CDs, Sentiment140, and T4SA. ABCDM uses two BiLSTM and GRU phases to retrieve future and past contexts in both directions. In addition, the attention mechanism is applied to the outputs of the bidirectional stages of ABCDM to focus on particular terms. The max-pooling phase is then utilized to minimize feature dimensionality by extracting contextual information. Further, the ABCDM model was compared with state-of-the-art models, including attention-based CNN and BiLSTM (AC-BiLSTM), SS-BED, HAN, ARC, CRNN, and IWV. That work considered only positive and negative sentiments in its experiments.
Jain et al. [11] proposed a CNN-LSTM model for sentiment analysis of the US airline and US airline quality datasets. Their research also considered only positive and negative sentiments and achieved 91.3% accuracy with CNN-LSTM on the US airline dataset. The paper compares the proposed CNN-LSTM model with existing machine learning algorithms such as Support Vector Classifier (SVC), Decision Tree (D.T.), Logistic Regression (L.R.), Naive Bayes (N.B.), CNN, and LSTM.
Umer et al. [12] provide a deep learning (DL) network that combines a CNN and an LSTM for sentiment prediction on Twitter data. This CNN-LSTM model was compared with existing ML algorithms, specifically with the SVC, Random Forest (R.F.), L.R., SGD, a Voting Classifier of SGD, and R.F. Moreover, the sentiment prediction performance was examined with two influential text feature extraction techniques, TF-IDF and word2vec. The authors evaluated the overall performance of the proposed method using three different datasets: the US airline, hate speech, and women's e-commerce clothing reviews.
Sezgen et al. [27] used the Latent Semantic Analysis (LSA) [28] text mining technique for sentiment analysis [29]. For that purpose, they collected 2,536 negative and 2,584 positive airline reviews from TripAdvisor.com. The study focused on examining the fundamental factors that determine passenger dissatisfaction and satisfaction and the variations among airlines' business strategies. An intrinsic shortcoming of this bag-of-words analysis is that the LSA classifier does not evaluate sentence-level content that derives from grammatical structure.
Xu et al. [30] suggested a BiLSTM-based sentiment analysis for posts and used it to address the problem of post-sentiment classification. The opinion knowledge participation intensity is incorporated into the TF-IDF technique of phrase strength calculation. A group-led approach of basis vectors based on the modified phrase strength calculation is suggested in response to the shortcomings of phrases developed mathematically in recent studies. Furthermore, the BiLSTM captures the semantic features and produces a good representation of the comments. The paper compared its proposed BiLSTM architecture with existing Naive Bayesian, CNN [31], RNN, and LSTM [32] models. Lastly, the opinion characteristic of a message is determined using a co-evolutionary network with SoftMax maps. The correctness of the suggested word representation approach is demonstrated through experimentation with various word representations.

Word
Here, f is the activation function, w is the weight parameter of the convolution kernel, and b is the bias. The text feature-map vector $D = (d_1, d_2, \ldots, d_{m-n+1})$ obtained by convolution is then pooled; the study employs the max-pooling approach. The formula is written as follows: The previous results were achieved using convolution and max-pooling operations on a single convolution kernel. The following are the results for c convolution kernels: After successive convolution and max-pooling layers, the output is passed to a fully connected layer. Finally, this fully connected layer is mapped to the labeled sample space, and the features are merged.

Bidirectional LSTM.
Figure 1 depicts the fundamental structure of the bidirectional LSTM [46]. The Bi-LSTM replicates the primary recurrent layer in the architecture so that two layers are formed side by side. The Bi-LSTM provides the input sequence to the first layer as it is and provides the second layer with a reversed copy of the input sequence. In each iteration, it records the last words stored in its memory unit and evaluates the probability of the next word [47]. For each word stored in the library, the Bi-LSTM assigns a likelihood based on past terms, identifies the word that holds the highest probability, and stores that word in its memory. The exemplary memory of the Bi-LSTM makes it suitable for language generation, as it remembers the background of the conversation at any moment. The Bi-LSTM overcomes the limitation of traditional Recurrent Neural Networks (RNN) in storing long word sequences. A Bi-LSTM model is a four-layer neural network, and each LSTM's memory unit consists of three gates: the input, output, and forget gates, as shown in Figure 2.
These gates allow the model to either retain or forget words at any moment by regulating the flow of data through the gate. This enables the Bi-LSTM model to track only relevant data. It also mitigates the vanishing gradient problem, which helps the system remember stored data for a longer time. The cell state $C_k$ runs through the network, and the LSTM gates, such as the input gate and output gate, control the flow of data through the Bi-LSTM via the sigmoid function [48]. When the value of the sigmoid activation function [49] is "1," the data are passed through the gate completely, whereas if the value of the sigmoid function is "0," the information is blocked by the gate. The amount of data that has to be passed through is decided by the forget gate, as defined in [49,50]. Here, σ is the sigmoid activation function, $W_f$ is the weight matrix of the forget gate, $h_{k-1}$ is the previous hidden state, and $b_f$ is the bias of the forget gate. Once the data are controlled by the forget gate, the input gate controls the new data that will be retained in cell state $C_k$. The current state of the cell and the memory unit status of the cell are obtained from equations (9) and (10), respectively.
Lastly, the output gate in the Bi-LSTM is used for controlling the output of the sigmoid function as shown in equation (11), and the hidden layer output is defined according to equation (10).
The terms b and W represent the bias value and the weight coefficient matrix, respectively; tanh is the hyperbolic tangent activation function.
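For reference, the gate equations of a standard LSTM cell can be written in the notation of this section as below. This is the textbook formulation, offered as an assumption about the form of the equations referenced above rather than a transcription of them; the symbols $x_k$ (input at step $k$), $i_k$, $o_k$, and $\tilde{C}_k$ are introduced here for illustration.

```latex
\begin{align*}
f_k         &= \sigma\left(W_f \cdot [h_{k-1}, x_k] + b_f\right) && \text{(forget gate)} \\
i_k         &= \sigma\left(W_i \cdot [h_{k-1}, x_k] + b_i\right) && \text{(input gate)} \\
\tilde{C}_k &= \tanh\left(W_C \cdot [h_{k-1}, x_k] + b_C\right)  && \text{(candidate cell state)} \\
C_k         &= f_k \odot C_{k-1} + i_k \odot \tilde{C}_k         && \text{(cell state update)} \\
o_k         &= \sigma\left(W_o \cdot [h_{k-1}, x_k] + b_o\right) && \text{(output gate)} \\
h_k         &= o_k \odot \tanh\left(C_k\right)                   && \text{(hidden state)}
\end{align*}
```

A gate value near 1 lets the corresponding term through almost unchanged, and a value near 0 suppresses it, which matches the behaviour of the sigmoid described above.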
The Bi-LSTM consists of an additional layer of reverse LSTM (backward LSTM) that reverses the flow of information. The hidden layer synthesizes the forward and backward information, which ensures that every cell in the LSTM can obtain the context information. The backward layer of the Bi-LSTM is evaluated similarly to the forward LSTM, except that the direction of the information flow is reversed to acquire the following information at a particular time. The forward and backward information flow in the Bi-LSTM network is illustrated in the following equations [51]: where $h_f$ and $h_b$ are the outputs of the forward and backward LSTM, respectively. And the final output of the hidden layer is given as

Proposed Hybrid ConvBidirectional-LSTM for Sentiment Prediction
This research uses the ConvBidirectional-LSTM model to construct a novel method for improving sentiment categorization on Twitter messages. The proposed approach integrates the CNN and BiLSTM neural networks. We used this hybridization to see how well the CNN combines with the bidirectional LSTM, which is renowned for its efficacy in sentiment prediction. The most significant advantage of the proposed approach is that it enables a large amount of information to be extracted successfully. In the first stage of the model, the word embedding matrix is produced through the GloVe embedding technique, and this word vector is then provided as an input to the CNN. In the next stage, the convolutional layer extracts the relevant features. Then, the dimensionality of the feature space for each input text is reduced using the max-pool approach. Finally, the feature vector is created in the fully connected layers for feature integration. It is then passed to the bidirectional LSTM to obtain contextual information about the textual data, which significantly enhances the sentiment classification accuracy. The suggested ConvBidirectional-LSTM model is described in six subsections as depicted in Figure 3.

Word Vectorization.
The proposed model takes the text data as input in this layer and then splits it into words or tokens. Every word is turned into a numerical vector. The vector of numerical values is produced using the GloVe pretrained word embedding algorithm. The GloVe technique is utilized separately to assess the performance of the model. If every text of m words is expressed as $V(w_1, w_2, \ldots, w_m)$, then every word is transformed into an n-dimensional vector representation, and the text input is specified as

$$V(w_1, w_2, \ldots, w_m) \in \mathbb{R}^{m \times n}. \quad (15)$$

Because input texts vary in length, every text used in the proposed model must be brought to the same size s. If the text content is smaller than s, its size is increased by applying a zero-padding technique, and if the text content is larger than the specified size s, it is shortened; as a result, every text content has the same vector size. The representation of each text content of dimension s is as given in [52]:

$$V(w_1, w_2, \ldots, w_m) \in \mathbb{R}^{s \times n}.$$
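The padding and truncation rule can be sketched as follows. Padding at the end, truncating the tail, the padding id 0, and the toy two-dimensional embedding table are all assumptions made for illustration; the text does not state which side is padded, and real GloVe vectors are much larger.

```python
def pad_or_truncate(token_ids, s, pad_id=0):
    """Return a list of exactly s ids: zero-pad short inputs, cut long ones."""
    if len(token_ids) >= s:
        return token_ids[:s]
    return token_ids + [pad_id] * (s - len(token_ids))


def embed(token_ids, table, n):
    """Look up an n-dimensional vector per id; unknown and padding ids map to zeros."""
    return [table.get(t, [0.0] * n) for t in token_ids]


# Hypothetical 2-dimensional vectors standing in for pretrained GloVe vectors.
table = {5: [0.1, 0.2], 9: [0.3, 0.4], 2: [0.5, 0.6]}

ids = pad_or_truncate([5, 9, 2], s=5)
V = embed(ids, table, n=2)
print(ids)                                        # [5, 9, 2, 0, 0]
print(len(V), len(V[0]))                          # 5 2, i.e. an s x n matrix
print(pad_or_truncate([5, 9, 2, 7, 1, 4], s=5))   # [5, 9, 2, 7, 1]
```

Whatever the original length m, the result always has s rows, so every text reaches the convolutional layer with the same shape.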

Convolutional Layer.
The convolutional layer is an essential stage of the CNN architecture, responsible for retrieving the relevant features from the Twitter text [53]. For the convolution operation, a one-dimensional convolutional layer receives the text vector array $V \in \mathbb{R}^{s \times n}$ from the GloVe embedding layer. To create the n-gram feature space in the one-dimensional convolutional layer, a feature matrix is computed by convolving the text with M filters, each using a convolutional kernel of size p. The filters $F_m$, where $1 \le m \le M$, produce a feature space as depicted in the following equation:

$$d_j^m = f\left(F_m \otimes Y_{j:j+p-1} + b_m\right),$$

where f is the nonlinear activation function, for which this research uses the swish [54] activation function; $F_m$ is the weight vector of the filter, expressed as $w \in \mathbb{R}^{s \times n}$; $b_m$ is the bias of filter $F_m$; n is the word matrix dimension size; ⊗ represents the convolutional operation; $Y_{j:j+p-1}$ denotes the window of words from which filter $F_m$ extracts a feature; and $d_j^m$, the j-th component of $d^m$, is the resulting feature of filter $F_m$. The features associated with the text of size s are as in

Max-Pooling Layer. The convolutional process generates feature maps, and thereafter the max-pooling layer collects the most significant feature, $\hat{d} = \max\{d\}$, to compute the appropriate local statistics. A one-dimensional max-pooler decreases the dimensions of its input by reducing every pooling window to its maximum value. Thus, the CNN architecture can efficiently minimize the number of features to avoid overfitting problems while simultaneously reducing run-time and parametric complexities.
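The convolution and max-pooling stages described above can be sketched in plain Python as follows. This is a single-filter illustration under stated assumptions: valid padding, stride 1, non-overlapping pooling windows, and toy values for the input, filter, and bias. It is not the authors' Keras configuration, and the function names are hypothetical.

```python
import math


def swish(x):
    """Swish activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def conv1d(Y, F, b):
    """Slide a p x n filter F over the s x n text matrix Y (valid padding, stride 1).

    Returns the feature map [d_1, ..., d_{s-p+1}].
    """
    p = len(F)
    features = []
    for j in range(len(Y) - p + 1):
        window = Y[j:j + p]  # rows j .. j+p-1 of the text matrix
        z = sum(F[r][c] * window[r][c]
                for r in range(p) for c in range(len(F[r])))
        features.append(swish(z + b))
    return features


def max_pool1d(d, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(d[i:i + size]) for i in range(0, len(d) - size + 1, size)]


# Toy input: s = 5 words, n = 2 embedding dimensions; one filter with p = 2.
Y = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 1.0]]
F = [[1.0, 0.0], [0.0, 1.0]]

d = conv1d(Y, F, b=0.0)
print(len(d))            # 4, i.e. s - p + 1 features
print(max_pool1d(d, 2))  # one maximum per window of two features
```

With M filters, the same two functions would be applied once per filter, giving M pooled feature lists that are then passed on to the next layer.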

Bi-LSTM Layer.
Unlike the LSTM classifier, the Bidirectional-LSTM includes two hidden states that allow data to flow in both directions, forward ($h_f$) and backward ($h_b$). This helps the Bidirectional-LSTM to understand the context entirely. All source data, comprising both past and future values, are kept using both paths, whereas the typical RNN framework does not include future context. The basic implementation of the Bidirectional-LSTM links two LSTM models running in opposite directions to a single output. The forward LSTM phase obtains past data, while the reverse LSTM phase obtains future data.
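The two-direction idea can be illustrated with a deliberately small sketch: a standard LSTM cell with a single hidden unit is run once over the sequence and once over its reverse, and the two outputs are paired per time step. The hidden size of 1, the weight values, and the function names are assumptions chosen for readability, not the configuration of the proposed model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a standard LSTM cell with a single hidden unit."""
    def gate(name, act):
        wx, wh, b = w[name]  # input weights, recurrent weight, bias
        return act(sum(a * v for a, v in zip(wx, x)) + wh * h_prev + b)

    f = gate("f", sigmoid)    # forget gate
    i = gate("i", sigmoid)    # input gate
    g = gate("g", math.tanh)  # candidate cell state
    o = gate("o", sigmoid)    # output gate
    c = f * c_prev + i * g    # new cell state
    h = o * math.tanh(c)      # new hidden state
    return h, c


def run_lstm(seq, w):
    """Return the hidden state after each element of seq."""
    h, c, outputs = 0.0, 0.0, []
    for x in seq:
        h, c = lstm_step(x, h, c, w)
        outputs.append(h)
    return outputs


def bidirectional(seq, w_fwd, w_bwd):
    """Pair forward and backward hidden states for every time step."""
    h_f = run_lstm(seq, w_fwd)
    h_b = run_lstm(seq[::-1], w_bwd)[::-1]  # re-align to the original order
    return [[f, b] for f, b in zip(h_f, h_b)]


# Arbitrary illustrative weights: (input weights, recurrent weight, bias) per gate.
W = {
    "f": ([0.5, -0.3], 0.1, 0.0),
    "i": ([0.2, 0.4], -0.2, 0.0),
    "g": ([0.7, 0.1], 0.3, 0.0),
    "o": ([-0.1, 0.6], 0.2, 0.0),
}
seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = bidirectional(seq, W, W)
print(out)  # three [h_f, h_b] pairs
```

At the first position the forward state has seen only the first word while the backward state has seen the whole sequence, which is how each position gains access to both past and future context.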
This architecture supports the system in retaining past and future data. In the Bidirectional-LSTM, the sequential result of the first phase serves as the input for the next step, while the sequence outcomes of the next step are concatenated with the final unit result of the backward and forward actions. The resulting outcome h after stacked Bidirectional-LSTM layers is shown as in
3.5. Dense Layer. The proposed framework incorporates a dense network layer that links each input with every output by utilizing its weights. Softmax is used as the activation function in the last layer to obtain the final result. Softmax converts the raw outputs into a probability distribution over the three classes, labeled 0, 1, and 2. Equation (20) represents the predicted outcome of the softmax function. The negative, neutral, and positive sentiments are labeled with 0, 1, and 2, respectively, and categorical cross-entropy is used as the loss.
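As a minimal sketch of the output stage, the snippet below applies softmax to three raw scores and evaluates the categorical cross-entropy for one example. The logit values are made up for illustration; the label mapping follows the text (0 negative, 1 neutral, 2 positive).

```python
import math

LABELS = {0: "negative", 1: "neutral", 2: "positive"}


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_class):
    """Loss for one example: negative log-probability of the true class."""
    return -math.log(probs[true_class])


logits = [2.0, 0.5, -1.0]  # hypothetical dense-layer outputs
probs = softmax(logits)
pred = max(range(len(probs)), key=probs.__getitem__)
print(LABELS[pred])                         # negative
print(categorical_cross_entropy(probs, 0))  # small loss when the true class is 0
print(categorical_cross_entropy(probs, 2))  # larger loss when the true class is 2
```

The predicted label is simply the class with the highest probability, and the loss grows as the probability assigned to the true class shrinks.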
3.6. Regularisation. Overfitting is one of the most challenging problems in deep learning. The model appears to fit the training data well but fails to generalize to unseen data. Regularisation techniques can mitigate overfitting problems. Regularisation is an approach to improving the generalization ability by making minor changes to the training algorithm; these changes constrain the model during training to decrease its complexity and avoid overfitting. Dropout and L2 are the most often used regularisation techniques. L2 regularisation, often referred to as weight decay or ridge regression, penalizes the loss function by adding the squared magnitude of the parameters as a penalty. Dropout is an approach for avoiding overfitting and improving generalization that involves randomly dropping units (both hidden and visible) during training. The architecture of the ConvBidirectional-LSTM model includes four dropout levels between the word vector matrix and the convolutional, max-pooling, and Bi-LSTM layers, and before and after the dense phase, with a dropout rate of 30%.
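Both techniques can be sketched in a few lines. The dropout rate of 0.3 comes from the text; the inverted-dropout rescaling, the L2 coefficient `lam`, and the function names are assumptions for illustration only.

```python
import random


def dropout(values, rate, rng, training=True):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(values)  # dropout is disabled at inference time
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def l2_penalty(weights, lam):
    """L2 term added to the loss: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
dropped = dropout([1.0] * 10, rate=0.3, rng=rng)
print(dropped)                           # a mix of 0.0 and rescaled values
print(l2_penalty([3.0, 4.0], lam=0.01))  # 0.01 * (9 + 16)
```

Rescaling the surviving values by 1 / (1 - rate) keeps their expected sum unchanged, so no adjustment is needed when dropout is switched off for prediction.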

Research Experiment
Experiments were conducted to test the ConvBidirectional-LSTM text sentiment classifier on the US airline sentiment dataset. This section comprises detailed descriptions of the dataset, data preprocessing, experimental setup, hyperparameter settings, and performance evaluation metrics.

Dataset.
The US airline sentiment dataset has a total of 14,640 rows and 15 columns; in this research work, only two columns are considered for the analysis. The first column contains a sentiment label, and the other contains the text review of the passenger. In this dataset, labels are categorized into negative, positive, and neutral sentiments, with 9178 rows as negative, 3099 as neutral, and 2363 as positive. The proposed model is trained and tested with 9516 and 5124 data rows, respectively. This US airline sentiment dataset is available at Kaggle (https://www.kaggle.com/welkin10/) and also at the CrowdFlower data library (https://www.data.world/crowdflower/). The proportion of sentiments for each airline contained in the US airline dataset [55] is illustrated in Table 2.
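The figures quoted above can be cross-checked with a few lines of arithmetic. The snippet only restates the counts given in the text; the derived percentages and the split fraction are computed from them, not reported by the authors.

```python
counts = {"negative": 9178, "neutral": 3099, "positive": 2363}
total = sum(counts.values())

# Share of each sentiment class, in percent.
shares = {label: round(100 * n / total, 2) for label, n in counts.items()}

train_rows, test_rows = 9516, 5124
train_fraction = train_rows / total

print(total)                     # 14640
print(shares)
print(round(train_fraction, 2))  # 0.65
```

The class counts add up to the stated 14,640 rows, the dataset is noticeably imbalanced towards negative tweets, and the train and test rows correspond to a 65/35 split.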

Experimental Setup.
Deep learning models can be developed using various methods, tools, and packages. In the proposed research work, Keras [56], a widely used tool, is used with TensorFlow as the backend. The research experiments were executed in Google Colaboratory and Jupyter Notebook on a Microsoft Windows 11 PC with an AMD Ryzen 3 3250U with Radeon Graphics processor (2.60 GHz) and 8 GB RAM. To compare the experimental outcomes of the proposed model directly with other existing state-of-the-art models, accuracy is used as the primary performance indicator. Additional performance indicators, including recall, precision, and F1-measure, are also reported for the proposed classifier.

Data Preprocessing.
To select the essential features from the text data for sentiment prediction, data preprocessing is considered the initial phase of the experiment. Since most of the text data contain a mix of misspelled words, parts-of-speech (POS) tagging, slang, exclamation marks, acronyms, punctuation marks, etc., the key objective of the preprocessing stage is to remove excessive textual information, conflict, distortions, and inconsistency. It has been observed that most textual data may be straightforward, distorted, inconsistent, or doubtful. Therefore, textual data preprocessing is required to evaluate the similarities and keep the data in a format suitable for further investigation. In this research work, the authors applied the data preprocessing phase to remove stop words and signatures of the airlines while preserving some essential words of the text data, such as "nor," "not," and "no," which are prone to reflect negative sentiments. Then, stemming and lemmatization functions were applied to the textual data. Further, regular expressions (re) were used to correct word repetition and to clean the data by removing usernames, punctuation, HTML, emoji, and URLs. Furthermore, all the text data were changed to lowercase to make the text dataset more uniform. Finally, the data preprocessing stage tokenized all the text data into individual tokens, and 12041 unique tokens were retrieved from the US airline sentiment text dataset. Table 3 shows the values of the hyperparameters that are used in the proposed ConvBidirectional-LSTM model.
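A rough, stdlib-only sketch of the cleaning steps described above is shown below. The regular expressions, the toy stop-word list, the order of the steps, and the function name `clean_tweet` are assumptions for illustration; stemming and lemmatization are omitted because they require an external NLP library, and the authors' exact pipeline is not given in the text.

```python
import re

KEEP = {"no", "nor", "not"}  # negations preserved, as described in the text
# Toy stop-word list for illustration; a real list would be much longer.
STOP_WORDS = {"the", "a", "an", "is", "was", "to", "and", "my", "on",
              "no", "nor", "not"}


def clean_tweet(text):
    """Lowercase, strip noise, and tokenize one tweet."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"<[^>]+>", " ", text)                # HTML tags
    text = re.sub(r"@\w+", " ", text)                   # usernames / airline handles
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)          # "sooooo" -> "soo"
    text = re.sub(r"[^a-z\s]", " ", text)               # punctuation, digits, emoji
    tokens = text.split()
    return [t for t in tokens if t not in STOP_WORDS or t in KEEP]


tweet = "@united My flight was NOT on time!!! sooooo late http://t.co/x <br>"
print(clean_tweet(tweet))  # ['flight', 'not', 'time', 'soo', 'late']
```

Keeping "not" while dropping other stop words matters here: without it, the cleaned tweet would read as "flight time late" and lose the explicit negation.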

Evaluation Metrics.
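The performance evaluation indicators listed below are the most important criteria for measuring the performance of the proposed approach:

$$\text{Pre} = \frac{\text{TRP}}{\text{TRP} + \text{FLP}}, \qquad \text{Rec} = \frac{\text{TRP}}{\text{TRP} + \text{FLN}},$$

$$\text{Acc} = \frac{\text{TRP} + \text{TRN}}{\text{TRP} + \text{TRN} + \text{FLP} + \text{FLN}}, \qquad F_1 = \frac{2 \cdot \text{Pre} \cdot \text{Rec}}{\text{Pre} + \text{Rec}},$$

where Pre, Rec, Acc, TRN, TRP, FLN, and FLP are precision, recall, accuracy, true negative, true positive, false negative, and false positive, respectively.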
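As a worked example of these indicators, the snippet below computes them from the four counts using their standard definitions. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not results from the paper.

```python
def binary_metrics(trp, trn, flp, fln):
    """Precision, recall, accuracy, and F1 from confusion-matrix counts."""
    pre = trp / (trp + flp)                      # of predicted positives, how many are right
    rec = trp / (trp + fln)                      # of actual positives, how many are found
    acc = (trp + trn) / (trp + trn + flp + fln)  # overall share of correct predictions
    f1 = 2 * pre * rec / (pre + rec)             # harmonic mean of precision and recall
    return {"Pre": pre, "Rec": rec, "Acc": acc, "F1": f1}


# Hypothetical counts for one class.
m = binary_metrics(trp=80, trn=90, flp=20, fln=10)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

For a three-class problem such as this one, the same function can be applied to each class in turn by treating that class as positive and the other two as negative, and the per-class values can then be averaged.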

Results and Discussion
The proposed ConvBidirectional-LSTM framework was evaluated with the optimal hyperparameter settings and compared with the existing state-of-the-art models, namely the GloVe-based CNN-LSTM and HeBiLSTM models. Table 4 shows the overall accuracy, precision, recall, and F1 score of the different deep learning models. The accuracy of the proposed ConvBidirectional-LSTM framework was 93.25%, whereas the accuracy of the GloVe-CNN-LSTM and GloVe-HeBiLSTM models was 92.79% and 92.47%, respectively. The observed accuracy reflects that the proposed framework outperforms the GloVe-based CNN-LSTM and HeBiLSTM models. The 3.21% training loss and 22.07% validation loss of the ConvBidirectional-LSTM framework over 15 epochs are shown in Figure 4.
Similarly, the 99.47% training accuracy and 93.25% validation accuracy of the proposed framework over 15 epochs are shown in Figure 5. The confusion matrix and the ROC curve of the proposed ConvBidirectional-LSTM framework are depicted in Figures 6 and 7. Table 5 shows that Wen and Li [13] proposed three variants of the hybrid RNN and CNN model: the Attention Recurrent Convolutional (ARC), Recurrent Convolutional (R.C.), and Multiple Attention Recurrent Convolutional (M_ARC) models. The accuracy of the ARC, R.C., and M_ARC models was 83.10%, 83.20%, and 83.30%, respectively, on the US airline dataset for positive, negative, and neutral sentiments. Jain et al. [11] proposed the CNN-LSTM model on the US airline dataset for classifying positive and negative opinions, achieving an accuracy of 91.30%, and compared the proposed CNN-LSTM model with other machine learning models. Basiri et al. [9] proposed the attention-based bidirectional CNN-RNN deep model (ABCDM) for analyzing positive and negative sentiments on the US airline sentiment dataset, with an accuracy of 92.75%. Table 5 also reports the accuracy of 93.25% achieved by the proposed ConvBidirectional-LSTM framework on the US airline dataset for predicting positive, negative, and neutral sentiments. The overall result shows that the proposed ConvBidirectional-LSTM model performs better than the existing state-of-the-art models.

Conclusion
The proposed sentiment analysis approach aimed to develop an efficient deep learning-based sentiment prediction model. This model analyzes users' sentiments using textual data from social media platforms. The data for experimental evaluation were collected from the microblogging site Twitter. This research article suggested a ConvBidirectional-LSTM model for sentiment prediction.
The sentiment prediction was carried out with three sentiment labels, positive, negative, and neutral, whereas earlier works in the literature considered only two labels, positive and negative, for their sentiment analysis. The performance of the proposed approach was evaluated by different performance metrics such as F1 score, recall, accuracy, and precision. Further, the proposed ConvBidirectional-LSTM method was compared with the existing state-of-the-art models, namely the GloVe-based CNN-LSTM and HeBiLSTM models.
Although this research work was carried out on an airline dataset that contains textual data in the English language, the authors intend to extend the proposed model to other languages in future work. The architecture of the ConvBidirectional-LSTM framework could also be improved to enhance sentiment classification accuracy by implementing other hybrid deep learning models.
Data Availability
The dataset was taken from online digital libraries, namely Kaggle and CrowdFlower. The authors have also included the links in the manuscript.

Conflicts of Interest
The authors declare that they have no conflicts of interest.