An Efficient AP-ANN-Based Multimethod Fusion Model to Detect Stress through EEG Signal Analysis

Stress is a universal emotion that every human experiences daily. Psychologists say stress may lead to heart attack, depression, hypertension, stroke, or even sudden death. Stress detection through facial expression, speech, text, physical behavior, and other cues has been explored, but no consensus has been reached on the best method. Advances in biomedical engineering have driven rapid development of electroencephalogram (EEG) signal analysis, which inspired the multimethod fusion approach proposed here for the first time. It combines the discrete wavelet transform (DWT) for de-noising, adaptive synthetic sampling (ADASYN) for class balancing, and affinity propagation (AP) as a stratified sampling model, with an artificial neural network (ANN) as the classifier for human emotion classification. From the EEG recordings of the DEAP dataset, the artifacts are removed, the signal is decomposed using a DWT, and features are extracted and fused to form the feature vector. Because the dataset is high-dimensional, feature selection is performed, and ADASYN is used to address the class imbalance, which results in large-scale data. The innovative idea of the proposed system is to perform sampling using affinity propagation as a stratified sampling-based clustering algorithm: it determines the number of representative samples automatically, which is an advantage over K-means and K-medoids, both of which require the value of K to be specified in advance. These samples are used as inputs to various classification models, and the AP-ANN, AP-SVM, and AP-RF models are compared on five performance metrics: accuracy, precision, recall, F1-score, and specificity. In our experiments, the AP-ANN model achieves an accuracy of 86.8%, a precision of 85.7%, an F1-score of 84.9%, a recall of 84.1%, and a specificity of 89.2%, which together are better than the results of the other algorithms compared.


Introduction
Stress is a major problem that humans experience in daily life. It is defined as the way the body responds to a demanding situation or threat. With COVID-19 affecting the entire world, chronic stress has become very common; a survey reports that more than 70% of Americans experience stress regularly [1]. The most dangerous aspect of stress is that people readily develop it without being aware of its effect on them. There are many causes for a person to be under stress in the current decade. The conditions and situations that cause stress are generally called stressors, and they have a major influence on mood, health, and behavior [2]. Stressors include work stress, caused by a heavy workload, greater responsibility, or the risk of termination, and life stress, caused by unemployment, the death of a person, or illness.
Stress can be caused not only by external factors but also by internal ones, such as constantly worrying about events happening around oneself. Stress is individual-specific, as the amount of stress a person can tolerate varies for many reasons [3]. Chronic stress occurs when a person experiences continuous stress without any relief. Stress that harms people is referred to as negative stress [4]; it may lead to physical imbalances, including headache, high blood pressure, stroke, and heart attack, and to emotional imbalances, including depression, anxiety, hypertension, and fear. Sometimes, stress may even lead to death. Hence, there is a primary need to detect stress at an early stage and manage it through appropriate measures.
Various methods exist for analyzing stress, including analysis of stress in the voice [5], image-processing systems that detect stress by analyzing facial expressions [6], and analysis of human stress from mobile phone usage, which collects data from smartphones, surveys, and call logs [7]. Traditional methods using EEG signals have numerous drawbacks, including sensitivity to external factors such as sweating and room temperature, and invasive procedures. Therefore, there is a need for a method that is precise, accurate, noninvasive, and reliable. The proposed work aims to create an EEG-based stress analyst, a system that detects human stress levels using a noninvasive brain-computer interface.
Numerous techniques have been proposed to detect stress from negative emotions such as sadness and anger by classifying EEG signals, and this use of EEG signals to detect emotions gives promising results [8,9]. Among the noninvasive techniques [10] for measuring brain activity, the EEG-based methodology was found to be the best, with a low setup cost. It measures the brain's electrical activity directly from electrodes placed on the scalp [11]. EEG measures the minute electrical differences produced by neurons using the electrodes and sends the signals to an external device. With advances in technology, there has been massive development in wearable systems that can record electrophysiological signals to detect acute stress [12]. EEG signals are divided into a number of sub-bands of different frequencies, and different brain states can be analyzed from each frequency band [13]. Stress is detected by classifying emotions from the recorded EEG signals using machine learning algorithms [14].
Since the dataset is high-dimensional, with a large number of features, an efficient feature selection method is needed to minimise the features that do not contribute much to the classification result. This is done with the Pearson correlation coefficient (PCC) method, in which the feature-to-feature and feature-to-class correlations are calculated. The correlations are then ranked in decreasing order. The first feature is selected, and the feature set is expanded by adding the next feature in the order [21]. The process continues until there is no further improvement, which yields the best feature set. From the feature set, a range value is calculated, the features outside the range are eliminated, and only the top 10 features are used in the subsequent steps [22,23].
The ADASYN algorithm is used in the proposed model to handle the class imbalance of the DEAP dataset by creating additional synthetic samples for the minority class [24,25]. Traditional data mining algorithms suffer from computational inefficiency, and sampling is an effective data reduction technique that lowers the computational cost and improves speed. Among the various random sampling methods, stratified sampling suits the needs of this work: it divides the available dataset into strata and picks a random item from each, because the items in a stratum share common characteristics [26]. This sampling method is widely used in human research. Affinity propagation (AP) does not require the number of clusters to be specified, because the clusters are formed through message passing, and the exemplars computed by AP are representative of all the other data points in their clusters. These exemplars are used to train the model [27].
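The class-balancing idea can be illustrated with a minimal, dependency-free sketch of ADASYN-style oversampling. It is not the implementation used in this work: the helper names, the neighbourhood size k, the balance level beta, and the toy data are all assumptions made for illustration. Minority points that have more majority-class neighbours receive more synthetic samples, and each synthetic sample is interpolated between a minority point and one of its minority neighbours.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest(pool, points, i, k):
    """Indices of the k points in `pool` closest to points[i], excluding i."""
    others = [j for j in pool if j != i]
    others.sort(key=lambda j: euclidean(points[i], points[j]))
    return others[:k]


def adasyn(points, labels, minority=1, k=3, beta=1.0, seed=0):
    """Return synthetic minority samples (hypothetical helper, not a library API)."""
    rng = random.Random(seed)
    everyone = list(range(len(points)))
    minority_idx = [i for i in everyone if labels[i] == minority]
    n_majority = len(points) - len(minority_idx)
    target = int((n_majority - len(minority_idx)) * beta)  # samples to create
    if target <= 0 or len(minority_idx) < 2:
        return []

    # Difficulty of each minority point: share of majority-class points
    # among its k nearest neighbours in the whole dataset.
    ratios = []
    for i in minority_idx:
        neighbours = nearest(everyone, points, i, k)
        ratios.append(sum(labels[j] != minority for j in neighbours) / k)
    total = sum(ratios)
    if total == 0:
        weights = [1.0 / len(minority_idx)] * len(minority_idx)
    else:
        weights = [r / total for r in ratios]

    synthetic = []
    for i, w in zip(minority_idx, weights):
        minority_neighbours = nearest(minority_idx, points, i, k)
        for _ in range(round(w * target)):
            j = rng.choice(minority_neighbours)
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append([a + lam * (b - a)
                              for a, b in zip(points[i], points[j])])
    return synthetic


# Toy data: 8 majority points (label 0) and 3 minority points (label 1).
majority_pts = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1], [0, 2], [1, 2]]
minority_pts = [[3, 3], [3, 4], [4, 3]]
X = majority_pts + minority_pts
y = [0] * len(majority_pts) + [1] * len(minority_pts)
new_samples = adasyn(X, y)
```

Because every synthetic point is a convex combination of two minority points, the new samples stay inside the region already occupied by the minority class. In practice a maintained library implementation would normally be preferred over a hand-written one.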
Many studies have classified emotions in the given dataset using many classifiers, evaluated with different performance measures such as minimum error, precision, F-score, accuracy, and p value [28,29]. In our research, an artificial neural network (ANN) is used to classify the EEG data, as its results are more promising than those of the existing classification algorithms [30]. The affinity propagation-based stratified sampling methodology combined with the ANN suggested in this study is compared with the support vector machine (SVM) and random forest (RF) using various metrics.
Classical EEG signal classification, as described in various research papers, involves the following steps: signal preprocessing -> feature extraction -> classification. These algorithms fail to work efficiently because other important concerns, such as the high dimensionality of the data, the class-imbalanced nature of the DEAP dataset, and the computational cost of training the classifier, are not considered. Though many of these concerns have been addressed separately in different research articles, there is no work on a hybrid model. In our proposed work, these limitations are addressed through suitable techniques, and a multimethod hybrid model is proposed. The workflow of the proposed model is as follows: signal preprocessing -> feature extraction -> feature selection -> class balancing -> stratified sampling -> classification. The main innovation of this research work is the use of affinity propagation, a clustering algorithm, as a stratified sampler. Representative samples selected through this method resemble the population more closely than those selected by traditional sampling algorithms.
The key contributions of this research work are listed as follows:
(iv) Classifiers trained on the DEAP dataset are prone to overfitting because its classes are imbalanced. To overcome this, the adaptive synthetic sampling approach (ADASYN) is used; it differs from SMOTE, which fails to consider lower-density areas when upsampling minority classes.
(v) To improve performance and reduce the computational cost of training on very large data, stratified sampling is performed, and only the representative elements are used to train the classifier. Among the various sampling techniques, stratified sampling achieves better coverage of the population, as researchers can ensure that all strata are represented in the sample. Cluster-based sampling generally uses some sampling scheme to decide on the number of representative samples and then applies a clustering algorithm to form the clusters. Existing systems increase the sampling complexity because the number of representative samples must be explicitly defined before clustering, as in K-means and K-medoids, and the left-out sample objects must then be fitted. To overcome this, affinity propagation (AP) is used: it forms clusters by passing messages among the data points, and the exemplars of the final iteration are taken as the representative samples. This increases the efficiency of stratified sampling on large data, and the performance of AP is compared with that of K-means and K-medoids in this study.
(vi) Finally, experimental studies were conducted on three machine learning algorithms: SVM, ANN, and RF. Extensive experiments show that the fusion model of an AP-based sampler with the ANN model outperforms the state-of-the-art models.
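The five metrics used to compare the classifiers (accuracy, precision, recall, F1-score, and specificity) can all be derived from a binary confusion matrix. The following is a minimal sketch under the assumption that label 1 denotes the stressed class and label 0 the unstressed class; the function name and the toy predictions are hypothetical and are not results from this study.

```python
def binary_metrics(y_true, y_pred):
    """Compute five standard metrics for a binary classifier (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives

    def ratio(num, den):
        # Return 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),
    }


# Made-up labels purely to exercise the function.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
scores = binary_metrics(y_true, y_pred)
```

In this toy case there are three true positives, three true negatives, one false positive, and one false negative, so all five metrics evaluate to 0.75.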

Structure and Literature Review
This paper is structured as follows: Section 2 reviews other authors' contributions related to this research work, and Section 3 describes the modular structure of the AP-ANN framework. In Section 4, the proposed model is implemented and its results are discussed, and Section 5 presents a detailed summary of the research work along with its limitations and suggested future enhancements.
Recognizing stress from EEG signals has been an active research topic for the past few years owing to the increase in patients with depression, and there has been a continuous urge to find a technological solution for it. Many research works have tried to improve the classification results. All the papers mentioned below make use of the DEAP dataset and propose various feature extraction, feature selection, and emotion classification techniques. Giuseppe Placidi et al. [31] proposed the classification of emotions using the DEAP dataset. Relaxation-phase EEG signals were obtained from the participants. The signals were decomposed by wavelet decomposition into approximation and detail coefficients. The SVM classifier was applied to the features extracted using principal component analysis (PCA). Abeer Al-Nafjan et al. [32] considered two emotion models, of which the dimensional emotion model, which relates valence and arousal, was used for emotion recognition. Deep neural network (DNN) and random forest classifiers were used to classify emotions, time-frequency features and frontal asymmetry features were extracted, and the results show that the DNN performs better than the random forest.
Jingxin Liu et al. [33] extracted time, frequency, time-frequency, and wavelet domain-based features in their model, and the mRMR algorithm was used for feature selection. The classification algorithms used were random forest and KNN. Sukriye Kara and Ergin [34] used the DWT as a preprocessing algorithm and the SVM as the classification algorithm. Features such as entropy, energy, and standard deviation were computed, and different pairs of features were used for training. The energy feature with the SVM classifier showed good accuracy in the detection of epilepsy. Sachin Borse [35] used ICA and DWT to de-noise EEG signals. The DWT decomposes the signal and applies thresholding to the decomposed signals. The ICA transforms the input signal into independent components and rejects the components with more noise. Whitening was performed before ICA to make the input signals uncorrelated.
Prashant Lahane and Thirugnanam [36] used the Teager-Kaiser energy operator (TKEO) for feature extraction, and classification tree, K-nearest neighbor, and neural network classifiers were implemented, with the conclusion that TKEO gives better accuracy than kernel density estimation and relative energy. Princy et al. [37] explained a statistical method for artifact removal from EEG signals using the wavelet transform. The wavelet transform method analyzes the signals with low noise amplitudes so that they can be removed from the original signals by selecting the best wavelet to decompose the signal. Artifacts were removed from the EEG signal with the wavelet transform by detecting spikes, without taking the signal-to-noise ratio into consideration. Gaikwad [38] analyzed the effects of stress and discussed a methodology for detecting stress using EEG signals. The phases involved in capturing real-time signals from the NeuroSky MindWave kit were explained, and the fast Fourier transform (FFT) was used as a preprocessing algorithm. The eSense meter, an analysis method, was used to indicate whether or not the user was in a stressed state. Bhuvaneswari and Satheesh Kumar [39] used an SVM kernel to classify positive and negative values of arousal and valence.
Computational Intelligence and Neuroscience
Wolpaw et al. [40,41] proposed a brain-computer interface (BCI) methodology to provide communication capabilities for people suffering from neuromuscular disorders, and distinguished dependent BCIs from independent BCIs. Abin et al. [42] proposed a smart home environment adjustment system based on EEG and IoT technology. The system detects the cognitive state of the person (alert or drowsy) and controls the devices in the environment accordingly. Ankita Tiwari [43] explained the use of LabVIEW for stress management using BCI. The NeuroSky MindWave sensor was used to acquire the signal from the human brain, and the proposed system also includes an Android application that helps to reduce stress by suggesting yoga and music after receiving an SMS. Tejaswini et al. [44] reviewed two publicly available datasets (DEAP and SEED), used the DWT for feature extraction and the SVM for classification, and obtained the final output by channel fusion.
Feature selection and feature extraction methods have been studied in detail for different datasets. Khan et al. [45] suggested a hybrid feature extraction method based on the fusion of several known features, such as GDC, RCC, and PseTNC, and proposed an optimized DNN that achieved 95.81% accuracy. The study in [46] explored traditional feature selection methods and proposed the UFS-UDR method, and [47,48] discussed the classification of RNA sequences and efficient feature extraction using the iEnhancer-DHF model, which works on DNA samples. In [49], a two-stage gene selection method is proposed as a solution to the feature extraction problem, with the SVM and RF as classifiers. Muhammad Ali et al. [50] analyzed ANN and SVM classifiers on a stock dataset and showed that the ANN performs better. In [51], the RPOS feature selection method is proposed, and its performance with RF, SVM, and KNN is analyzed. Ishfaq Ali et al. [52] used a data-driven approach to decide the number of clusters, K, in the K-means clustering algorithm, and in [53], a KNN-based ensemble method is proposed and its performance is evaluated.
Samarth Tripathi et al. [54] proposed two classification models, a deep neural network (DNN) and a convolutional neural network (CNN). The prepared data with 99 features were given as input to the DNN, and for the CNN, the DEAP data were converted into 2D images so that the CNN could learn from the images for classification. The study was designed to demonstrate the efficiency of neural networks in emotion classification. Ahmad and Olakunle [55] used the discrete wavelet packet transform (DWPT) and extracted entropy as the feature. They concluded that, compared to other statistical features such as power and energy, entropy provides good accuracy. Pascal Ackermann et al. [56] extracted features such as HHS, HOC, and STFT. The feature selection algorithm used was mRMR, which is best suited to categorical output class labels. The classifiers used were random forest and SVM, and the output labels were anger and surprise. Random forest was found to outperform the SVM in that study.

Modular Structure of the AP-ANN Frameworks
The process involves developing a system to detect stress based on human emotions. In the proposed work, the publicly available DEAP dataset is used. The DEAP dataset contains 32 files, one per participant, in .dat or .mat format. Two arrays are generated for each participant, as shown in Table 1. After the raw EEG data are gathered, preprocessing is performed. The raw dataset contains noise and artifacts; hence, it must be preprocessed to reduce their effect on feature extraction. Not all channels contribute to emotion identification, so suitable channels are selected, and the multidomain feature set of time and frequency features is then obtained from the preprocessed signal. The most significant features that contribute to emotion identification are selected, and Russell's valence-arousal model of emotions is applied to produce the output class label. Because the dataset is high-dimensional and its classes are imbalanced, class balancing and sampling algorithms are applied before training, after which the data are given as input to the classification models. Figure 1 illustrates the proposed methodology.
DEAP is a publicly available database for the analysis of human emotions. It contains EEG and physiological signals from 32 participants, recorded while they watched 40 one-minute videos. The participants were asked to rate their actual emotions on five scales: valence, dominance, arousal, liking, and familiarity.

Channel Selection and Signal Decomposition.
The goal of this work is to recognize emotion from the EEG signals in the DEAP dataset. Using all the channels of the 10-20 system would result in data redundancy and increased computational time. Therefore, as suggested by Omid Bazgir et al. [57], only the frontal lobe channels are selected, as the left and right frontal regions of the brain have been shown to contribute more to emotion than other regions. As part of the experimental setup, the channels were selected as different pairs; an experimental selection (ES) of channels is performed, and the channels that respond the most during emotional change are detected and used in the research. The selected channels are FC1, FC2, FC5, FC6, F3, F4, F7, F8, FP1, and FP2. After the EEG data are collected, they are preprocessed, that is, noise and artifacts are removed from the raw brain signals without losing the original information. Preprocessing also includes smoothing the brain signal. In the proposed work, the performance of DWT and ICA in removing noise and artifacts from the brain signal is compared.

Discrete Wavelet Transform (DWT).
The discrete wavelet transform (DWT) is used as a signal processing tool in many scientific and engineering applications. The wavelet transform involves scaling functions and wavelet functions, which are related to low-pass and high-pass filters, respectively. The DWT is used to decompose, de-noise, and recompose the EEG signal. Because wavelets are localized in both time and frequency, the DWT can identify the times at which the maximum and minimum variations in the signal occur. It gives information about both the frequency and time domains, whereas other processing techniques such as the fast Fourier transform provide only frequency-domain analysis. De-noising with the discrete wavelet transform is more efficient than other techniques because it preserves the original characteristics of the signal, as the de-noising is performed after the signal is decomposed. Once de-noising is complete, the signal is reconstructed to obtain the noise-free signal. The de-noising process involves the steps shown in Figure 2.
In the proposed work, the DWT algorithm is used to split the EEG signal acquired from DEAP into approximation and detail coefficients by filtering. The choice of an appropriate wavelet function and the number of decomposition levels are the deciding factors in DWT performance. The wavelet family contains different types of wavelets such as Daubechies, Haar, Symlet, Mexican Hat, and Morlet [58]. In the proposed work, the Daubechies-8 wavelet with eight-level decomposition is chosen because it is considered more effective for signal de-noising than other wavelet families. As compared and analyzed in previous studies [59,60], Daubechies is best suited to the analysis of EEG signals owing to its smoothing feature [61], and its accuracy is compared with that of other mother wavelet families. The filters used in the DWT algorithm are the low-pass filter and the high-pass filter, as shown in Figure 3. After the low-pass filter's approximation coefficient and the high-pass filter's detail coefficient are acquired at level 1, the level 2 coefficients are obtained by applying the same decomposition procedure to the level 1 approximation coefficient. Similarly, the outputs of the low-pass filters at each level are decomposed further. Thus, the EEG signal acquired from DEAP is decomposed into eight levels of detail coefficients, CD1, CD2, CD3, CD4, CD5, CD6, CD7, and CD8, and an approximation coefficient, CA8, as in Figure 3. Thresholding is applied to the obtained detail and approximation coefficients. The threshold value is calculated for each of the coefficients using the following formula:
$$\text{threshold} = \sqrt{2 \log n}$$
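The cascade described above (split into approximation and detail coefficients, then decompose the approximation again) can be sketched without any external library. Note the substitution: this sketch uses the Haar wavelet, not the Daubechies-8 wavelet chosen in this work, purely because Haar filters fit in two lines. The function names and the synthetic sine input are assumptions, and n in the threshold formula is taken here to be the number of samples, which the text does not state explicitly.

```python
import math


def haar_step(signal):
    """One DWT level with Haar filters: returns (approximation, detail)."""
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]  # low-pass
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]  # high-pass
    return approx, detail


def haar_wavedec(signal, levels):
    """Multi-level decomposition: returns (CA_levels, [CD1, ..., CD_levels])."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)  # only the approximation is split again
        details.append(detail)
    return approx, details


def universal_threshold(n):
    """threshold = sqrt(2 * log(n)), with n assumed to be the sample count."""
    return math.sqrt(2.0 * math.log(n))


# Synthetic stand-in for one EEG channel; 256 samples so 8 levels divide evenly.
signal = [math.sin(2.0 * math.pi * i / 32.0) for i in range(256)]
ca8, details = haar_wavedec(signal, levels=8)
threshold = universal_threshold(len(signal))
```

Here `details[0]` plays the role of CD1 and `details[7]` that of CD8, while `ca8` corresponds to CA8. Because the Haar transform is orthonormal, the total energy of the coefficients equals the energy of the input, which is a convenient sanity check.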
Soft thresholding is applied [61] after the threshold value is calculated: coefficients whose magnitudes exceed the threshold are shrunk towards zero by the threshold value, and, in the standard formulation, coefficients with smaller magnitudes are set to zero. Thus, the coefficients are de-noised, and the de-noised detail and approximation coefficients are obtained.
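A minimal sketch of the standard soft-thresholding rule follows. The function name and the example coefficients are hypothetical; the rule itself (shrink large coefficients by the threshold, zero out the rest) is the usual textbook definition.

```python
import math


def soft_threshold(coefficients, threshold):
    """Shrink each coefficient towards zero by `threshold`; zero the small ones."""
    shrunk = []
    for c in coefficients:
        magnitude = abs(c) - threshold
        # Keep the sign of c, reduce its magnitude; anything at or below
        # the threshold becomes exactly zero.
        shrunk.append(math.copysign(magnitude, c) if magnitude > 0 else 0.0)
    return shrunk


denoised = soft_threshold([3.0, -0.5, 1.2, -2.0], threshold=1.0)
```

With a threshold of 1.0, the coefficient 3.0 becomes 2.0, -2.0 becomes -1.0, 1.2 shrinks to about 0.2, and -0.5 is set to zero.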
After thresholding, the signal is reconstructed from the de-noised coefficients to obtain a de-noised EEG signal. The preprocessed EEG signal is partitioned into five frequency sub-bands, alpha, beta, gamma, theta, and delta, as shown in Table 2.

Independent Component Analysis (ICA).
For comparison with DWT, the EEG data are also de-noised by ICA using the EEGLAB toolbox. ICA tries to maximize independence by linearly transforming the input signal into subcomponents such that the mutual information between these subcomponents is zero. This method assumes that the generated subcomponents are independent of each other. An important aspect of ICA is that the number of input signals (S) and the number of subcomponents (C) generated must be the same. The other two cases are as follows:
$$S < C:\ \text{overdetermined}, \qquad S > C:\ \text{underdetermined}. \tag{2}$$
The EEGLAB toolbox is used to implement ICA-based de-noising. A participant's dataset acquired from DEAP is loaded into EEGLAB. The 'runica' algorithm is selected in EEGLAB to decompose the input signals into subcomponents. The input signals are acquired from the 10 selected channels of the DEAP dataset; therefore, 10 independent subcomponents are generated. The generated subcomponents are ordered by variance, and the subcomponents with higher variance are rejected using the "pop_subcomp" function in EEGLAB. After these components are removed, the remaining subcomponents are used to reconstruct the 10 channel signals, which are free from artifacts and noise.

Feature Extraction.
Feature extraction is the process of transforming the original raw data into an optimal set of features for processing. In the proposed work, after the EEG signal is preprocessed, features are extracted from the de-noised signal for the subsequent classification process. The feature set contains time-domain and frequency-domain features extracted from the EEG signals, as shown in Table 3. Time-domain analysis is a statistical analysis that gives more information about the variation of the signal amplitude. Frequency-domain analysis gives more information about patterns in the signal. Many research articles suggest features that perform well for EEG signals: for time-domain analysis, the statistical features suggested in [34] were extracted, and frequency-domain features such as energy, log energy entropy, Shannon entropy, power spectral density, and absolute power, as suggested in [47], were extracted and used in this research. These features, together with the median and mode as statistical features, make up 15 features. As there are 5 frequency bands, the features are calculated for each band, which yields the multidomain feature set. Though frequency- and time-domain features each have their limitations and advantages, the proposed multidomain feature set increases the classification accuracy. Thus, the feature set initially contains 75 features in total (15 features × 5 bands), of which 50 are time-domain features and 25 are frequency-domain features.
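The per-band feature fusion can be sketched as follows. This is only an illustration: it computes a subset of seven of the features named in the text (median, variance, standard deviation, skewness, kurtosis, energy, and absolute power), uses population moments as an assumed convention, and runs on made-up samples, so it yields 7 × 5 = 35 values instead of the 75 used in this work. The function names are hypothetical.

```python
import math
import statistics


def band_features(x):
    """A subset of the statistical and energy features for one sub-band signal."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # population variance (assumed)
    m3 = sum((v - mean) ** 3 for v in x) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n  # fourth central moment
    energy = sum(v * v for v in x)
    return {
        "median": statistics.median(x),
        "variance": m2,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
        "energy": energy,
        "absolute_power": energy / n,  # sum(x^2) / length(x)
    }


def fuse_band_features(bands):
    """Concatenate the features of every band into one named feature vector."""
    fused = {}
    for band_name, samples in bands.items():
        for feature_name, value in band_features(samples).items():
            fused[f"{band_name}_{feature_name}"] = value
    return fused


# Made-up samples standing in for the five sub-band signals.
band_names = ("delta", "theta", "alpha", "beta", "gamma")
bands = {name: [1.0, 2.0, 3.0, 4.0, 5.0] for name in band_names}
feature_vector = fuse_band_features(bands)
```

Naming each entry as band plus feature keeps the fused vector traceable, which helps when a later selection step reports which features were kept.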

Feature Selection.
Data with irrelevant or trivial features may reduce the efficiency and performance of the model. Hence, there is a need to select the significant features that have the most impact on prediction accuracy. Feature selection is the process of selecting the salient features from the given dataset. It eliminates the irrelevant features in the data, thus improving learning performance and reducing the time needed to train the model. In the proposed work, filter-based feature selection is used because it works well with large datasets containing many features, whereas the wrapper technique is expensive to run and complex for large datasets. Specifically, the Pearson correlation coefficient (PCC), a filter-based feature selection method, is used. Correlation identifies how one or more features are associated with other features.
As the first step of feature selection, a temporary feature set with a total of 15 features is prepared by selecting the features of the preprocessed signal; then, the correlation between each feature and the output label is calculated using the Pearson correlation coefficient C as follows:
$$C = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
where n is the number of samples, $x_i$ and $y_i$ are the ith data values of the two sets $\{x_1, x_2, \ldots, x_n\}$ and $\{y_1, y_2, \ldots, y_n\}$, and $\bar{x}$ and $\bar{y}$ are their mean values. The correlation value C lies between -1 and +1. A score near +1 indicates a strong positive correlation between features; that is, if one feature increases, the other also increases, and if one decreases, the other also decreases. A score near -1 indicates a strong negative correlation; that is, if one feature increases, the other decreases, and vice versa. A correlation score of 0 indicates that there is no linear relationship. In the proposed work, the correlation values of the 15 features are obtained from the Pearson correlation coefficient. These values are then sorted in decreasing order, their ranking indexes are found, and the top 10 features are listed in Table 4. Initially, the first feature is selected, and the feature set is expanded by adding the next feature in order. This process is called forward selection. Each time a feature is added, the feature set is evaluated and the prediction accuracy is calculated. The process continues until there is no improvement in the prediction accuracy, at which point the best feature set has been obtained. In the proposed work, the feature selection process stops with a subset of 10 features.
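The ranking step can be sketched as follows, assuming each feature is stored as a list of values aligned with a list of 0/1 output labels. The feature names and numbers are invented for illustration, and the forward-selection loop is omitted because it depends on the classifier used for evaluation. The sort follows the text and orders by signed correlation in decreasing order; ranking by absolute value is a common alternative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant sequence carries no linear relationship
    return covariance / (spread_x * spread_y)


def rank_features(features, labels, top_k):
    """Score every feature against the labels and keep the top_k by correlation."""
    scores = {name: pearson(values, labels) for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)  # decreasing order
    return ranked[:top_k], scores


# Hypothetical feature columns and labels, one entry per sample.
features = {
    "variance": [1, 2, 3, 4, 5, 6],
    "kurtosis": [2, 1, 4, 3, 6, 5],
    "mode": [3, 3, 3, 3, 3, 3],
    "skewness": [6, 5, 4, 3, 2, 1],
}
labels = [0, 0, 0, 1, 1, 1]
selected, scores = rank_features(features, labels, top_k=2)
```

In this toy example the two retained features are `variance` and `kurtosis`; the constant `mode` column scores 0 and the reversed `skewness` column scores the negative of `variance`.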

Russell's Valence-Arousal Model.
In the proposed work, emotions are used to distinguish stressed from unstressed states. People with positive emotions are considered to be in an unstressed state, while people with negative emotions are considered to be in a stressed state. To determine the output labels for the feature set, Russell's valence-arousal model of emotions is used, as shown in Figure 4.
In the DEAP dataset, the participants' responses are labeled as emotions in the valence-arousal model, each rated on a scale from 1 to 9. A threshold value of 5 is used to classify the ratings as high or low. In the proposed work, if the valence is negative, whether the arousal is high or low, the output label is set to "1" and the state is concluded to be stressed. If the valence is positive, whether the arousal is high or low, the output label is set to "0" and the state is concluded to be unstressed, as shown in Figure 5.
Thus, the final dataset contains the feature set with the selected features and the corresponding output label, "1" (stressed) or "0" (unstressed). The final prepared dataset is used to classify a person's stress state based on valence and arousal values.
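The labeling rule above reduces to a threshold on the valence rating, as the following sketch shows. One detail is an assumption: the text does not say whether a rating of exactly 5 counts as high or low, so this sketch treats it as high (positive). The function names are hypothetical.

```python
def quadrant(valence, arousal, threshold=5.0):
    """Place a rating pair in the valence-arousal plane."""
    valence_sign = "positive" if valence >= threshold else "negative"
    arousal_level = "high" if arousal >= threshold else "low"
    return valence_sign, arousal_level


def stress_label(valence, arousal, threshold=5.0):
    """Return 1 (stressed) for negative valence, 0 (unstressed) otherwise."""
    valence_sign, _ = quadrant(valence, arousal, threshold)
    return 1 if valence_sign == "negative" else 0


# Ratings are on the 1 to 9 scale described above.
ratings = [(2.5, 8.0), (3.0, 2.0), (7.1, 8.0), (6.0, 2.0)]
output_labels = [stress_label(v, a) for v, a in ratings]
```

The arousal rating is carried through `quadrant` only to mirror the four cases listed in the text; it does not change the resulting label.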

Features | Formulas | Descriptions
Variance | | Variance shows the spread of the EEG data points of the signal around their mean value.
Standard deviation | | The square root of the arithmetic mean of the squared deviations of the EEG signal from its mean.
Skewness | | Skewness measures the distortion of the EEG signal data from a symmetrical distribution. Symmetrically distributed data have a skewness of 0.
Kurtosis | | Kurtosis measures the peakedness of the distribution of the EEG data points. Higher kurtosis indicates a sharper peak of the signal at the mean point.
Shannon entropy | | The Shannon entropy indicates the variation of the signal at each frequency scale.
Power spectral density | $\mathrm{PSD} = \frac{1}{N}\,|X|^2$ | The PSD is used to identify brain wave differences in terms of frequency.
Absolute power | $\mathrm{Power}(x) = \frac{\sum x^2}{\mathrm{length}(x)}$ | The absolute power describes the power of the entire signal.

Stratified Sampling.
As the proposed model deals with high-dimensional data, using all the records to train the model increases the computational time, so an efficient sampling method is suggested to find the optimal set of records for training. Stratified sampling is a sample selection technique in which the records of interest are subdivided into homogeneous clusters, or strata, and a representative from each cluster is taken as a sample for analysis [62]. In this project, affinity propagation is used to form the strata. It is a clustering algorithm that does not require the number of clusters to be specified in advance, because it is based on message passing between data points. An exemplar is the data point that forms the centre of a cluster. The similarity between data points is taken as input, and it is calculated as the negative squared Euclidean distance between each pair of data points as follows:
$$s(a, b) = -\lVert x_a - x_b \rVert^2$$
where $x_a$ and $x_b$ are the feature vectors of points a and b. The similarity s(a, b) indicates how well point b is suited to act as an exemplar for point a. The diagonal of s(a, b), where a = b, is known as the "preference", which controls the number of clusters generated. Once the similarity between data points is found, messages carrying responsibility and availability values are exchanged between the data points. The responsibility r(a, b) is represented as follows:
$$r(a, b) = s(a, b) - \max_{b' \neq b} \{\mathrm{avail}(a, b') + s(a, b')\}$$
where b' ranges over the other points competing to be the exemplar. The responsibility r(a, b) represents the message sent from data point a to candidate exemplar b, indicating how well suited point b is to be an exemplar for point a. The availability avail(a, b) represents the message sent by candidate exemplar b to point a, indicating how appropriate it would be for a to select b as its exemplar. The availability avail(a, b) is represented as follows:
$$\mathrm{avail}(a, b) = \min\Big(0,\; r(b, b) + \sum_{a' \notin \{a, b\}} \max\big(0, r(a', b)\big)\Big), \tag{7}$$
where a ≠ b.
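The important parameter in affinity propagation is the damping factor λ, which avoids numerical oscillation while exchanging messages. The damping factor is applied to the responsibility and availability as follows:

$$ r \leftarrow (1 - \lambda)\, r_{\mathrm{new}} + \lambda\, r_{\mathrm{old}}, \tag{8} $$

$$ \mathrm{avail} \leftarrow (1 - \lambda)\, \mathrm{avail}_{\mathrm{new}} + \lambda\, \mathrm{avail}_{\mathrm{old}}. \tag{9} $$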
The damping factor can have a value from 0.5 to 0.9, and in the proposed work, the damping factor is fixed to 0.5. The responsibility and availability matrices are updated until the maximum number of iterations is reached, the changes fall under a certain threshold, or the values remain constant. Once the update of the responsibility and availability matrices is completed, the final exemplars are computed from the criterion matrix, which is represented as follows:

$$ c(a, b) = r(a, b) + \mathrm{avail}(a, b). \tag{10} $$

Here, the b with the highest criterion value in each row of c(a, b) is the exemplar for data point a. The data points that have a common exemplar are grouped under the same cluster. In the proposed work, the training dataset has been prepared using AP to train the classifier model and increase its efficiency and performance. The population is divided into strata through affinity propagation, and the exemplar of each stratum, a single data record that represents all the data records in that stratum, is chosen as its sample, which yields better accuracy and performance. Similarly, the output label of the exemplar is obtained by selecting the label that appears most often among the data records of its stratum.
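The message-passing procedure and the exemplar-based sampling described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the median default for the preference, the stopping rule (assignments unchanged for `stable_iter` iterations), and the toy data are all assumptions. The self-availability avail(b, b), which the text does not write out, follows the standard affinity propagation rule (sum of positive responsibilities, not clipped at zero).

```python
import numpy as np

def affinity_propagation(X, preference=None, damping=0.5, max_iter=200, stable_iter=15):
    """Return, for every row of X (at least 2 rows), the index of its exemplar."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Similarity: negative squared Euclidean distance between every pair of points.
    S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # The diagonal holds the preference; the median similarity is an assumed default.
    np.fill_diagonal(S, np.median(S) if preference is None else preference)
    R = np.zeros((n, n))  # responsibilities r(a, b)
    A = np.zeros((n, n))  # availabilities avail(a, b)
    rows = np.arange(n)
    previous, unchanged = None, 0
    for _ in range(max_iter):
        # Responsibility: s(a, b) minus the best competing avail(a, b') + s(a, b').
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new
        # Availability: min(0, r(b, b) + positive r(a', b) summed over a' not in {a, b}).
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = A_new.diagonal().copy()  # avail(b, b) is not clipped at 0
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, self_avail)
        A = damping * A + (1 - damping) * A_new
        # Criterion: in each row, the column with the largest r + avail is the exemplar.
        exemplar_of = (R + A).argmax(axis=1)
        same = previous is not None and np.array_equal(exemplar_of, previous)
        unchanged = unchanged + 1 if same else 0
        previous = exemplar_of
        if unchanged >= stable_iter:
            break
    return previous

def stratified_sample(X, y, exemplar_of):
    """Keep one exemplar per stratum, labelled by the majority label of its stratum."""
    exemplars = np.unique(exemplar_of)
    labels = [np.bincount(y[exemplar_of == e]).argmax() for e in exemplars]
    return X[exemplars], np.array(labels)

# Toy data (hypothetical): two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
y = np.array([0, 0, 1, 1, 1, 1])
exemplar_of = affinity_propagation(X)
X_rep, y_rep = stratified_sample(X, y, exemplar_of)
print(exemplar_of, y_rep)
```

Passing `preference=S.min()`-style values instead of the median generally produces fewer strata, which is the trade-off between the minimum and median preference settings compared later in the paper.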

Emotion Classification for Stress Detection.
In the proposed work, a pattern recognition network, which is a feedforward backpropagation neural network (FFBPNN), is trained to classify the inputs according to their target outputs. In the feedforward network, information flows from the input nodes to the hidden layer and finally to the output nodes. The backpropagation algorithm is a training method for neural networks that compares the actual outputs with the expected outputs to calculate the error, and based on that error, the weights of the layers are adjusted backwards from the output layer to the input layer.
In the proposed work, the neural pattern recognition toolbox of MATLAB has been used to train the network on the feature set. As an initial step, the pattern recognition network randomly assigns weights and biases to the nodes in the neural network. The extracted samples with their selected features are given as input to the network input layer, and the input set is divided into independent subsets at a ratio of 4 : 1 (80% training set and 20% test set). This ensures that unseen samples are fed into the classifier during testing, so that the performance of the model can be analyzed. The ANN model is compared with the random forest and SVM and provides better classification results; thus, the neural network classification model along with affinity propagation is used to detect the stress of the participants when watching the videos (see Algorithm 1).
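The training setup described above can be illustrated with a small NumPy sketch. It is not the MATLAB toolbox network used in this work: the single hidden layer of 8 tanh units, the sigmoid output, the cross-entropy gradient, the learning rate, the epoch count, and the synthetic two-class data are all assumptions made for illustration. Only the random initialization, the forward and backward passes, and the 4 : 1 split come from the text.

```python
import numpy as np

def split_4_to_1(X, y, rng):
    """Shuffle the rows, then keep 80% for training and 20% for testing."""
    order = rng.permutation(len(X))
    cut = (4 * len(X)) // 5
    train, test = order[:cut], order[cut:]
    return X[train], y[train], X[test], y[test]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_network(X, y, hidden=8, lr=0.1, epochs=1000, seed=0):
    """Train a one-hidden-layer feedforward network with backpropagation."""
    rng = np.random.default_rng(seed)
    # Initial step: small random weights and biases.
    W1 = rng.normal(0.0, 0.1, (X.shape[1], hidden))
    b1 = rng.normal(0.0, 0.1, hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, 1))
    b2 = rng.normal(0.0, 0.1, 1)
    target = y.reshape(-1, 1).astype(float)
    for _ in range(epochs):
        # Forward pass: input layer -> hidden layer -> output node.
        H = np.tanh(X @ W1 + b1)
        P = sigmoid(H @ W2 + b2)
        # Backward pass: the output error is propagated back to the hidden layer.
        dZ2 = (P - target) / len(X)
        dZ1 = (dZ2 @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ dZ2)
        b2 -= lr * dZ2.sum(axis=0)
        W1 -= lr * (X.T @ dZ1)
        b1 -= lr * dZ1.sum(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    return (sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2) >= 0.5).astype(int).ravel()

# Synthetic stand-in data (hypothetical): two Gaussian classes in two dimensions.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(-2.0, 1.0, (100, 2)), rng.normal(2.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
X_train, y_train, X_test, y_test = split_4_to_1(X, y, rng)
params = train_network(X_train, y_train)
test_accuracy = float((predict(params, X_test) == y_test).mean())
print(len(X_train), len(X_test), test_accuracy)
```

Because the test rows are never seen during training, `test_accuracy` estimates performance on unknown samples, which is the purpose of the 4 : 1 split.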

Results and Discussion
In the process of finding the stress of the participants from the DEAP dataset, the dataset is first analyzed and the workflow is framed; the workflow diagram of the AP-ANN model for the DEAP emotion classification process is given in Figure 6.
Initially, a dataset is prepared using the raw EEG recordings of 40 channels of 32 participants in the DEAP database, where each participant takes 40 trials, with 15 features for each of 5 sub-bands, which yields a total of 75 features obtained by DWT from the EEG recordings of DEAP. With the initial feature set, a matrix of 51200 * 75 is formed, which is given as input to the three classifiers (neural network, random forest, and SVM), and Figure 7 shows the performance comparison.
Then, a preprocessing step is applied after selecting the 10 most important channels that contribute to the identification of stress levels. This step involves de-noising the signal; the two methods DWT and ICA are compared, and the performance is again compared on the various performance metrics, as shown in Figure 8.
Still, the classification accuracy of this preprocessed dataset is not appreciable, as there is a class imbalance and the dataset is high-dimensional. Dimensionality reduction is done by selecting the 10 significant features using the Pearson coefficient method. After the selection process, 50 features (10 features for each of the 5 sub-bands) are calculated for the selected 10 channels of the 32 participants. The feature vector is given as input to the classifiers (SVM, random forest, and neural networks), and a detailed comparison before and after the feature selection is shown in Figure 9. Now, the dataset is (32 participants * 40 trials * 10 channels) * 50 features, i.e., 12800 * 50. The performance of the classifier is checked for 10, 20, 30, and 40 trials, and no large variation in performance was found as the number of trials increases from 20 to 40, so the first 20 trials were selected for further processing; i.e., the dataset is (32 participants * 20 trials * 10 channels) * 50 features, i.e., 6400 * 50. This comparison study is given in Table 5. Moreover, this step reduces the training time considerably, but the precision of the algorithm is still very low due to the imbalanced nature of the classes.
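The Pearson-based feature selection can be sketched as follows. This is an illustration only: the text does not state what the features are correlated against, so ranking each feature by the absolute value of its Pearson correlation with the class label is an assumption, as are the function name and the toy data.

```python
import numpy as np

def top_k_by_pearson(X, y, k=10):
    """Return the indices of the k features most correlated with y, and all r values."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant features have zero variance; their correlation is set to 0.
    r = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    selected = np.sort(np.argsort(-np.abs(r))[:k])
    return selected, r

# Toy data (hypothetical): columns 0 and 2 track the label, columns 1 and 3 do not.
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])
X = np.column_stack([y, np.ones(8), -y, [0, 0, 1, 1, 0, 0, 1, 1]])
selected, r = top_k_by_pearson(X, y, k=2)
print(selected, np.round(r, 3))
```

Using the absolute value keeps strongly negative correlations, which carry as much information about the label as positive ones.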
In the process, overfitting seems to be one limiting factor for performance enhancement, so the dataset is balanced with ADASYN to prevent the classifiers from overfitting and to improve the classification rate. After class balancing, the precision is greatly improved, as shown in Figure 10.
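The idea behind ADASYN-style balancing can be sketched with NumPy. This is a simplified variant under stated assumptions, not the implementation used in this work: instead of ADASYN's rounded per-point sample counts, harder minority points are drawn more often as seeds so that exactly the missing number of samples is produced. The function name, k = 5, and the toy data are hypothetical, and the code assumes there are no duplicate rows.

```python
import numpy as np

def adasyn_sketch(X, y, minority_label=1, k=5, seed=0):
    """Return synthetic minority samples that bring the two classes to equal size."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_min = X[y == minority_label]
    n_min = len(X_min)
    n_new = (len(X) - n_min) - n_min  # majority count minus minority count
    if n_new <= 0 or n_min < 2:
        return np.empty((0, X.shape[1]))
    # Hardness of each minority point: share of majority points among its k nearest
    # neighbours in the full dataset (column 0 of the sort is the point itself).
    d_all = np.linalg.norm(X_min[:, None, :] - X[None, :, :], axis=2)
    nn_all = np.argsort(d_all, axis=1)[:, 1:k + 1]
    hardness = (y[nn_all] != minority_label).mean(axis=1)
    total = hardness.sum()
    weights = hardness / total if total > 0 else np.full(n_min, 1.0 / n_min)
    # Interpolation partners: nearest neighbours within the minority class only.
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    nn_min = np.argsort(d_min, axis=1)[:, 1:k + 1]
    # Draw seeds in proportion to hardness, then interpolate towards a random partner.
    seeds = rng.choice(n_min, size=n_new, p=weights)
    partners = nn_min[seeds, rng.integers(0, nn_min.shape[1], size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[seeds] + gap * (X_min[partners] - X_min[seeds])

# Toy data (hypothetical): 30 majority rows and 10 minority rows.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (30, 2)), rng.normal(1.5, 1.0, (10, 2))])
y = np.array([0] * 30 + [1] * 10)
X_new = adasyn_sketch(X, y)
print(X_new.shape)
```

Every synthetic row lies on a segment between two real minority rows, so the new samples stay inside the region already occupied by the minority class.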
After the ADASYN process, 4095 samples of the stressed class are added synthetically, resulting in our final dataset of 10495 * 50 features and one output label. The computational time for the entire process is large when this dataset is given to the classifier, so a stratified sampling approach is used to handle it. Various cluster-based sampling methods such as AP, K-means, and K-medoid were tried as the stratified sampling method, and their clusters are validated with the Davies-Bouldin index, Dunn index, and Silhouette index. The comparison is shown in Figure 11. The preference of the affinity propagation clustering can be set to either the minimum preference value or the median preference value. The performance of both preference values is analyzed in the same figure.
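The dataset sizes quoted in this section follow from simple products, which the short check below reproduces. All numbers come from the text; the variable names are only illustrative.

```python
participants = 32

# Initial feature matrix: all 40 channels, 40 trials, 15 features x 5 sub-bands.
initial = (participants * 40 * 40, 15 * 5)

# After channel selection (10 channels) and feature selection (10 features x 5 sub-bands).
selected = (participants * 40 * 10, 10 * 5)

# After keeping only the first 20 trials.
reduced = (participants * 20 * 10, 10 * 5)

# After ADASYN adds 4095 synthetic samples of the stressed class.
balanced = (reduced[0] + 4095, reduced[1])

print(initial, selected, reduced, balanced)
```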
Affinity propagation (AP) shows better clustering consistency as a stratified sampler, and its preference values along with various classifiers such as the ANN, RF, and SVM are illustrated in Figure 12. The performance is analyzed on the five evaluation metrics, among which accuracy and specificity are far better in the AP-ANN than in the AP-RF and AP-SVM.
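The five evaluation metrics can be computed from the counts of a binary confusion matrix, as in the sketch below. Treating the stressed class as the positive class (label 1) is an assumption, and the example labels are made up for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, F1-score, and specificity for two classes."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

# Made-up labels: 4 stressed (1) and 6 unstressed (0) samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Specificity is reported alongside recall because, on an imbalanced dataset, a classifier can reach high recall on one class while misclassifying much of the other.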
The 10-fold cross-validation [63], an error estimation method that generally has a lower bias than other methods, is not appropriate for classifying the original unbalanced DEAP dataset, but our ADASYN-balanced DEAP dataset can be validated with it, and the results of this 10-fold validation are compared and illustrated in Figure 13.

Algorithm 1
Input: The EEG signals of 32 participants watching 40 one-minute videos, recorded from 40 channels
Output: Stressed or unstressed state of the participant, obtained by classifying emotion
Step 1: Out of the 40 channels, only 10 channels are selected in our experiment.
Step 2: The EEG signal is preprocessed using DWT to remove noise and artifacts: the signal is decomposed into 8 levels using the low-pass and high-pass filters, highly distorted signals are nullified, and the signal is recomposed into 5 sub-band frequencies.
Step 3: The 15 most significant features are calculated, and among those only the top 10 highly correlated features are selected using the PCC method.
Step 4: Class balancing is done using the ADASYN algorithm to increase the minority class samples.
Step 5: Stratified sampling is done using affinity propagation; exemplars are selected from each stratum and used as representative samples to train the neural network.
Step 6: The labels of arousal and valence determine the stress level using the condition below.
Step 7: If (arousal is high) and (valence is −ve) || if (arousal is high) and (valence is +ve)
{ Classify it as stressed }
else
{ Classify it as unstressed }
Step 9: This dataset is used to train the feedforward backpropagation neural network, and the accuracy and other performance metrics are computed.
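The labelling condition of Step 7 can be written as a small Python function. This sketch takes the condition exactly as printed; the use of the DEAP 1 to 9 self-assessment scale with its midpoint of 5 as the threshold for "high" arousal and positive valence is an assumption, as are the function and parameter names. As printed, both branches of the condition require high arousal, so the sign of the valence does not change the outcome.

```python
def classify_stress(arousal, valence, midpoint=5.0):
    """Label one trial from its arousal and valence ratings, following Step 7 as printed."""
    arousal_high = arousal > midpoint      # assumed threshold on the 1-9 scale
    valence_positive = valence > midpoint  # assumed threshold on the 1-9 scale
    valence_negative = not valence_positive
    if (arousal_high and valence_negative) or (arousal_high and valence_positive):
        return "stressed"
    return "unstressed"

# Made-up ratings for three trials: (arousal, valence).
for arousal, valence in [(7.5, 2.0), (7.5, 8.0), (3.0, 2.0)]:
    print(arousal, valence, classify_stress(arousal, valence))
```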
The affinity propagation (AP) is the best stratified sampler, and setting its preference value to the minimum (min) along with the neural network classifier gives better performance. As our proposed system yields better accuracy, high precision, better recall and specificity, and a decent F1 score, a performance comparison of the proposed AP-ANN model with the performance results of various algorithms that used the DEAP dataset in other research papers is carried out and shown in Table 6.

Figure 9: Performance before and after feature selection.

Conclusions
The proposed multimethod fusion model based on the AP-ANN approach for stress detection, which analyzes bad and unhappy emotions, provides promising results. The findings are as follows. DWT provides better performance than ICA in de-noising. It is used to extract features of both the time and frequency domains, and the highly correlated features that were selected increase the system's efficiency. The upsampling of the minority class using ADASYN removed the threat of overfitting, and performing stratified sampling using AP clustering ensures that the best-fit representatives are used to train the classifier model and attain better performance. The classification accuracy has been compared among three significant classification algorithms, namely the SVM, neural network, and random forest, among which the neural network has achieved a high accuracy of 86.8%, which is 9% better than the result obtained without affinity propagation, 16% better than the result obtained without ADASYN and AP, and 29% better than that of the classification of the preprocessed data. The proposed method also has some limitations. In the DEAP dataset, a single evaluation may not be enough to rightly represent the emotional state of a participant: as the video extracts are played for 60 s, there is a considerable chance that many emotional states evolve in that period. When dealing with imbalanced classes using the ADASYN algorithm, adjusting and refining the data has its limits. Though balanced data classes are an experimental need, creating new data can never replace the original features. Stratified sampling generally cannot be used in all studies, as it has the drawback that each member of the population must be studied individually to find the best representative sample, and finding an exhaustive list of representative samples is very challenging.
The future scope of research in this domain is the fusion of other physiological data from various sources that can be used along with the EEG signals as a hybrid model to improve the performance of emotion classification.

Data Availability
The data that support the findings of this study are available from the DEAP dataset, a dataset for emotion analysis using EEG, physiological, and video signals, at the following link: http://www.eecs.qmul.ac.uk/mmv/datasets/deap/download.html. The dataset is licensed for academic research only and is not publicly available.

Disclosure
The research was performed as part of the employment of the authors working at Kumaraguru College of Technology.