In today's digital era, social media platforms such as Facebook, Google, Twitter, and YouTube are used by a majority of people, generating vast amounts of textual content. This user-generated text discloses important information about people's personalities and can help identify a particular type of individual known as a psychopath. The aim of this work is to classify input text into psychopath and nonpsychopath traits. Most existing work on psychopath detection has been performed in the psychology domain using traditional approaches, such as the SRP-III technique, with limited dataset sizes. This motivates us to build an advanced computational model for psychopath detection in the text analytics domain. In this work, we investigate an advanced deep learning technique, namely attention-based BiLSTM, for psychopath detection with an increased dataset size for efficient classification of input text into psychopath vs. nonpsychopath classes.
According to psychology, traits provide a way of describing a person, such as generous, outgoing, or short-tempered. The trait-driven approach is the most studied area in the psychology literature. A trait describes a person's characteristic way of responding and reacting to a certain situation [
Psychopaths are perceived as cunning, antisocial, and manipulative, and such individuals constitute only about 1% of the population [
The traditional techniques for identifying psychopaths include the Psychopathy Checklist-Revised (PCL-R) and the Welsh Anxiety Scale [
Few studies have applied computational models for psychopath detection, covering machine learning and deep learning-based techniques. The machine learning techniques include Support Vector Machine (SVM), Random Forest (RF), Logistic Regression, and Multilayer Perceptron (MLP) [
The rapid evolution of social media networks like Facebook, Twitter, and YouTube has allowed users to communicate information by interacting with the community. Social media users express themselves through status updates, images, text, and public profiles [
This work aims to develop an automated method that can distinguish psychopaths from nonpsychopaths using textual content available on social media sites. The study conducted by [
The aforementioned limitations can be overcome by exploiting BiLSTM for classifying reviews into psychopath and nonpsychopath classes. The BiLSTM layer retains both past and future context information, capturing long-term dependencies within a review to predict its class.
In this work, we address the problem of psychopath personality detection from text. To distinguish psychopath reviews from nonpsychopath ones, the psychopath classification problem is treated as a binary classification task. The training dataset consists of user reviews labeled as psychopath or nonpsychopath.
RQ.1: How can the input text be classified into psychopath and nonpsychopath classes by applying a deep learning technique, namely BiLSTM?
RQ.2: What is the performance of the proposed deep learning system, namely BiLSTM, compared with different machine learning and deep learning techniques?
RQ.3: How effective is the proposed system with respect to the baseline techniques?
The rest of the article is organized as follows: Section
This section provides a review of previous work conducted on psychopath detection.
In their work on identifying the relationship between personality and online behavior, Whitty et al. [
The student personality profiles were collected, and feedback was analyzed with respect to their learning experiences [
Bukhtawer et al. [
Keshtkar et al. [
Liu et al. [
YouTube-based personality recognition is performed by [
Pednekar and Dubey [
Khan et al. (2017) developed a personality item pool in the Urdu language using a well-known translation model called Darwish. The proposed questionnaire is an important tool in the user's own language, i.e., Urdu. Internal consistency checks show that the Urdu version is more consistent than the English one.
The aim of this study [
Using Dominance, Influence, Compliance, and Steadiness assessments, Ahmad and Siddique [
Tandera et al. [
In their work on Extraversion personality traits, Shaheen et al. [
The purpose of the work performed by Hancock et al. [
Hancock et al. [
To identify the psychopathy using Twitter account information, Wald et al. [
Le et al. [
This section describes the proposed methodology developed for classifying text into psychopath and nonpsychopath classes using a deep learning model, namely BiLSTM. The job of the BiLSTM is to store past context using the forward LSTM and future context using the backward LSTM [
Proposed system.
The details of each module are presented as follows.
The first module of the proposed methodology comprises two steps.
In this step, we acquired the required dataset from different social media sites, such as Facebook and Twitter. For example, we used the hashtag “#psychopath” to crawl the required tweets using a Python-based library, namely Tweepy [
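A hypothetical crawling sketch using Tweepy's Twitter API v2 client (the bearer token is a placeholder; query filters and rate limits are simplified):

```python
import tweepy

# Authenticate with a placeholder bearer token and search the hashtag.
client = tweepy.Client(bearer_token="YOUR_BEARER_TOKEN")
response = client.search_recent_tweets(query="#psychopath -is:retweet lang:en",
                                       max_results=100)
tweets = [tweet.text for tweet in (response.data or [])]
```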
To conduct the experiments, we first split the dataset into two segments, namely a train set and a test set, using sklearn's train_test_split method [
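A minimal sketch of this split (the 90%/10% ratio follows the hyperparameter settings given later; `reviews`, `labels`, the random seed, and stratification are our assumptions):

```python
from sklearn.model_selection import train_test_split

# Hold out 10% of the labeled reviews for testing.
train_texts, test_texts, train_labels, test_labels = train_test_split(
    reviews, labels, test_size=0.1, random_state=42, stratify=labels)
```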
In Table
Dataset description.
Dataset | User reviews | Personality class |
---|---|---|
Personality data | 601 | Psychopath (300), nonpsychopath (301) |
In the data cleaning task, noise is removed from the input text while preserving its original meaning, which helps enhance the accuracy of the text classification process. The data cleaning module is applied because real-world user input text contains a significant amount of noise, which must be removed before performing NLP tasks such as text classification [
In this technique, the input text is transformed into lowercase using a Python script.
Special characters such as “#,” “%,” “?,” “@,” “-,” “&,” “$,” and “/” are removed from the input text.
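A minimal sketch covering the two cleaning steps above (the character set is illustrative):

```python
import re

def clean_text(text: str) -> str:
    text = text.lower()                       # case transformation
    text = re.sub(r"[#%?@\-&$/]", " ", text)  # remove special characters
    return re.sub(r"\s+", " ", text).strip()  # collapse leftover whitespace

print(clean_text("Check @user #psychopath & more?"))  # -> "check user psychopath more"
```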
The objective of the tokenization process is to split the input text into small tokens/pieces. To perform tokenization, we used the Keras tokenizer [
In this section, we describe the different layers used in the proposed deep neural network model, BiLSTM, for detecting psychopathy in input text, as shown in Figure
This layer performs word embedding using the Keras embedding layer. Its purpose is to produce a word-level representation in which word indices are transformed into embedding vectors.
The aim of this layer is to prevent overfitting. The rate parameter of the dropout layer is set to a specified threshold (here, 0.7); its valid range lies between 0 and 1. This layer is placed after the embedding layer and randomly deactivates a fraction of the embedding layer's neurons during training.
The BiLSTM layer acts as the second layer of the proposed model; it receives input from the embedding layer and transforms it into a new encoding. The BiLSTM performs two-way encoding, maintaining not only previous but also future information.
Finally, the softmax activation function is used at the output layer to perform the classification task. In this layer, input sentences are classified as psychopath or nonpsychopath.
The proposed architecture of the BiLSTM model for personality detection into binary classes (psychopath and nonpsychopath) comprises four major layers: (i) an embedding layer, (ii) a dropout layer, (iii) a BiLSTM layer, and (iv) an output layer.
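A minimal Keras sketch of this four-layer stack (our illustration rather than the authors' released code; the sizes follow the parameter settings reported in the experiments section, and the 100-unit BiLSTM is just one of the sizes tried there):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Dropout, Bidirectional, LSTM, Dense

# (i) embedding, (ii) dropout, (iii) BiLSTM, (iv) softmax output.
model = Sequential([
    Embedding(input_dim=2000, output_dim=128),  # vocabulary size 2000, embedding dim 128
    Dropout(0.7),                               # dropout rate stated in the text
    Bidirectional(LSTM(100)),                   # forward + backward LSTM encoding
    Dense(2, activation="softmax"),             # psychopath vs. nonpsychopath
])
```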
Initially, a sample input text, “I just wanted to see how it felt to shoot grandma,” is taken; it then moves sequentially through the different layers of the proposed deep neural network model. Each layer is elaborated as follows.
The sample input text must be converted into numerical form before the deep learning model can be applied. The numerical representation of the given input text starts with the tokenization process. The Keras tokenizer method “tokenizer.fit_on_texts” assigns an integer index to each unique word; since “to” occurs twice, it receives a single (more frequent, hence lower) index, e.g., [“to:1” “I:2” “just:3” “wanted:4” “see:5” “how:6” “it:7” “felt:8” “shoot:9” “grandma:10”]. Moving forward, the input text is converted into an integer sequence, [2, 3, 4, 1, 5, 6, 7, 8, 1, 9, 10], using the companion method “tokenizer.texts_to_sequences.” Finally, the prepared input is fed to the initial layer of the proposed deep learning model.
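A minimal sketch of this tokenization step (assuming the Keras bundled with TensorFlow):

```python
from tensorflow.keras.preprocessing.text import Tokenizer

# Build the word -> integer index on the sample sentence from the walkthrough.
tokenizer = Tokenizer(num_words=2000)  # max-features from the hyperparameter settings
sample = ["I just wanted to see how it felt to shoot grandma"]
tokenizer.fit_on_texts(sample)

print(tokenizer.word_index)                  # {'to': 1, 'i': 2, 'just': 3, ...}
print(tokenizer.texts_to_sequences(sample))  # [[2, 3, 4, 1, 5, 6, 7, 8, 1, 9, 10]]
```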
The primary layer of the model translates the previously obtained integer indices into low-dimensional feature (embedding) vectors, which help capture syntactic and semantic information. For instance, each word index is mapped to a dense embedding vector.
The next layer is the dropout layer, which follows the embedding layer. The dropout rate parameter ranges between [0, 1]. Its job is to reduce the problem of overfitting.
We employ a deep learning technique, namely BiLSTM, to classify text into psychopath and nonpsychopath sentences. For capturing both the syntactic and semantic information of a sentence, the Long Short-Term Memory (LSTM) neural network has shown considerable performance improvement. The Bidirectional Long Short-Term Memory (Bi-LSTM) architecture processes text in both directions using dual LSTM layers, which allows us to capture both the previous and subsequent context of a given sentence [
The Bi-LSTM model has gained much attention recently due to its superior ability to maintain sequence information by considering both past and future contexts, as both contexts have equal importance [
The BiLSTM network is composed of two subnetworks: a forward LSTM and a backward LSTM [
The softmax function is then applied at the output layer to convert the final BiLSTM representation into class probabilities.
The following equations are used to compute the forward and backward LSTM hidden states; we give the standard LSTM formulation, where $\sigma$ is the sigmoid function and $\odot$ denotes element-wise multiplication.

Forward LSTM equations:

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i \overrightarrow{h}_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f \overrightarrow{h}_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o \overrightarrow{h}_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c \overrightarrow{h}_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
\overrightarrow{h}_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$

Backward LSTM equations: identical in form, but the sequence is processed in reverse, so the recurrence depends on the future state:

$$
\overleftarrow{h}_t = \overleftarrow{\mathrm{LSTM}}(x_t, \overleftarrow{h}_{t+1})
$$

The final representation at each time step concatenates both directions: $h_t = [\overrightarrow{h}_t; \overleftarrow{h}_t]$.
The final layer applies the softmax activation function to predict the probability of each class tag, i.e., “psychopath” and “nonpsychopath,” using the equation below.
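For the final hidden representation $h$ produced by the BiLSTM, a standard softmax output layer (weight matrix $W$ and bias $b$; written here in the usual form, since the paper's own equation is not reproduced) computes the probability of each class $c$:

$$
P(y = c \mid h) = \frac{\exp(W_c h + b_c)}{\sum_{c' \in \{\text{psychopath},\, \text{nonpsychopath}\}} \exp(W_{c'} h + b_{c'})}
$$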
After passing through the softmax activation function, the “psychopath” class tag achieves the maximum probability, so the given input text “I just wanted to see how it felt to shoot grandma” is labeled as “psychopath” (see Figure
Algorithm 1 summarizes the complete procedure.
Algorithm 1: Pseudocode for personality detection in input text using BiLSTM.
Start
Section 1: Data preparation
1. Load the input text T
2. Tokenize the input text
3. Allocate an index to each related word
4. Convert the text into integer sequences
5. Hyperparameter initialization:
6. train set size = 90%, test set size = 10%, max-features = 2000, embed_dim = 128, batch_size = 32, epochs = 7
Section 2: Model construction
7. While training is not complete do
8. Create embedding vectors of all words in T = [t1, t2, t3, t4, ..., tm] // convert text to machine-readable feature (word) vectors
9. Apply the dropout layer for overfitting reduction
10. Apply the BiLSTM operation (forward and backward LSTM equations above)
Section 3: Classification
11. Train the model
12. Apply the softmax operation to predict the class tag
13. End while
Terminate
In this section, we present the results and their analysis from the different experiments conducted in response to the research questions formulated in Section
To classify text into psychopath and nonpsychopath classes, we used different Bi-LSTM models with varying parameters. The parameter settings for the proposed Bi-LSTM model are shown in Table
Parameters settings of the proposed Bi-LSTM model.
Parameter | Value |
---|---|
Input vector size | 100 |
Vocabulary size | 2000 |
Embedding dimension | 128 |
Bi-LSTM unit size | 100, 150, 200, 250, 300, 350 |
Number of hidden layers | 2 |
Activation function | Softmax |
Number of epochs | 7 |
Batch size | 32 |
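Assuming the tokenizer and model from the earlier sketches, training under these settings might look as follows (variable names are hypothetical, and the loss/optimizer choices are our assumptions, as the paper does not state them):

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Pad the integer sequences to the input vector size of 100 from the table.
X_train = pad_sequences(tokenizer.texts_to_sequences(train_texts), maxlen=100)
X_test = pad_sequences(tokenizer.texts_to_sequences(test_texts), maxlen=100)

# Train with the batch size and epoch count from the table.
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])
model.fit(X_train, train_labels, batch_size=32, epochs=7,
          validation_data=(X_test, test_labels))
```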
We conducted different experiments to answer RQ1; details are given as follows.
We conducted an experiment with varying parameter settings of different Bi-LSTM models. Table
Evaluation metric results of the Bi-LSTM models.
Model name | Recall | Precision | F1-score |
---|---|---|---|
Bi-LSTM(1) | 0.84 | 0.84 | 0.84 |
Bi-LSTM(2) | 0.82 | 0.82 | 0.82 |
Bi-LSTM(3) | 0.74 | 0.74 | 0.74 |
Bi-LSTM(4) | 0.85 | 0.85 | 0.85 |
Bi-LSTM(5) | 0.80 | 0.80 | 0.80 |
Bi-LSTM(6) | 0.80 | 0.80 | 0.80 |
During experimentation, we recorded the test accuracy, loss score, and training time for all the Bi-LSTM models with different parameter settings, as listed in Table
Test accuracy, loss, and training time of Bi-LSTM models.
Model name | Test accuracy | Test loss | Training time (s) |
---|---|---|---|
Bi-LSTM(1) | 0.84 | 1.39 | 5 |
Bi-LSTM(2) | 0.82 | 1.41 | 8 |
Bi-LSTM(3) | 0.74 | 1.56 | 15 |
Bi-LSTM(4) (proposed) | 0.85 | 1.39 | 24 |
Bi-LSTM(5) | 0.80 | 1.42 | 49 |
Bi-LSTM(6) | 0.80 | 1.44 | 39 |
To answer RQ2, we conducted different experiments with various machine learning classifiers and the proposed deep learning model. Details are presented in the following subsections.
To compare the proposed BiLSTM model with different machine learning classifiers, we implemented each ML classifier on the acquired dataset. The performance evaluation results are presented in Table
Comparison with machine learning techniques.
Machine learning classifiers and proposed model | Accuracy | Precision | Recall | F1-score |
---|---|---|---|---|
Decision tree | 75.41% | 0.76 | 0.77 | 0.75 |
SVM | 72.13% | 0.76 | 0.76 | 0.72 |
KNN | 70.49% | 0.73 | 0.70 | 0.71 |
MNB | 68.85% | 0.76 | 0.74 | 0.69 |
LR | 65.57% | 0.65 | 0.66 | 0.65 |
RF | 73.77% | 0.76 | 0.74 | 0.74 |
XGBoost | 77.05% | 0.77 | 0.77 | 0.77 |
Proposed (BiLSTM) | 85% | 0.85 | 0.85 | 0.85 |
It is obvious that the proposed DL model outperformed the different ML classifiers in terms of better precision (0.85), recall (0.85), and F1-score (0.85).
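For reference, a minimal sketch of one such ML baseline (TF-IDF features are our assumption, as the paper does not state the feature extraction; other classifiers such as SVM, KNN, MNB, LR, RF, and XGBoost can be swapped in the same way):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

# Fit a decision tree baseline on raw review texts via a TF-IDF pipeline.
baseline = make_pipeline(TfidfVectorizer(), DecisionTreeClassifier())
baseline.fit(train_texts, train_labels)
print(classification_report(test_labels, baseline.predict(test_texts)))
```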
To compare the proposed BiLSTM model with different deep learning classifiers, we implemented each DL classifier and the proposed BiLSTM classifier on the acquired dataset. The performance evaluation results are presented in Table
Comparison with deep learning techniques.
Deep learning classifiers | Accuracy | Precision | Recall | F1-score |
---|---|---|---|---|
CNN | 0.69 | 0.74 | 0.69 | 0.69 |
LSTM | 0.79 | 0.79 | 0.79 | 0.78 |
GRU | 0.82 | 0.82 | 0.82 | 0.82 |
RNN | 0.72 | 0.75 | 0.72 | 0.72 |
Proposed (BiLSTM) | 0.85 | 0.85 | 0.85 | 0.85 |
It is obvious that the proposed BiLSTM model outperformed the different DL classifiers in terms of better precision (0.85), recall (0.85), and F1-score (0.85).
While answering RQ3, we evaluated the performance of the proposed system with respect to the baseline studies. The performance evaluation results of the comparing studies and the proposed model are presented in Table
Performance evaluation with baseline studies.
Study | Technique (s) | Results |
---|---|---|
Wald et al. [ | (i) LR | 73% (accuracy) |
Sumner et al. [ | (i) SVM | 0.639 (accuracy) |
Preotiuc-Pietro et al. [ | (i) Unigram | Pearson correlation |
Proposed (our work) | Bi-LSTM | 0.85 (precision) |
Wald et al. [
Sumner et al. [
Preotiuc-Pietro et al. [
The proposed BiLSTM model, when applied to the labeled dataset, yielded the best performance results in terms of precision (85%), recall (85%), and F1-score (85%).
The current research work is aimed at classifying text into psychopath and nonpsychopath classes by exploiting a deep neural network model called Bi-LSTM. The proposed study consists of different modules: (i) data collection, (ii) preprocessing, and (iii) application of the deep learning model (BiLSTM).
First, the numerical representation (continuous values) of the input text is computed at the embedding layer. The proposed Bi-LSTM then stores sequence information in two directions, from left to right and from right to left, because it holds two layers known as the forward layer and the backward layer. These two layers of the Bi-LSTM model preserve context information in the forward and backward directions, generating a rich representation of the input text. Finally, the classification of the input text into the two classes, psychopath and nonpsychopath, is performed at the output layer using the softmax activation function. During the experiments, different machine learning and deep learning models were implemented on the given dataset. The conducted experiments reveal that the proposed Bi-LSTM model outperformed all the other comparing models, obtaining improved results in terms of precision (85%), recall (85%), and F1-score (85%).
The following are the limitations of the current research work.
- The dataset collected for experimentation is insufficient, which may degrade the performance of the proposed model.
- The present study is limited to the implementation of the BiLSTM model, without applying fusions of different deep neural networks such as CNN + LSTM, CNN + Bi-LSTM, CNN + RNN, and CNN + Bi-RNN.
- We exploited only random word embeddings for the input layer, while other word representation techniques such as GloVe and FastText may improve system performance.
- The focus of the present study is on English textual content.
- The current study does not exploit other kinds of features, such as audio, video, and images, that may assist system performance.
The future aims are as follows:
- Enhance the size of the dataset, which can assist in attaining improved results with the proposed Bi-LSTM model.
- Exploit other deep neural networks such as CNN + LSTM, CNN + Bi-LSTM, CNN + RNN, and CNN + Bi-RNN.
- Apply different word representation schemes such as GloVe and FastText.
- Extend the present work to personality classification in languages other than English.
- Inspect various features such as audio, video, and images beyond textual features.
- Extend the preprocessing module with other text cleaning steps such as grammar correction, spell checking, and stemming.
The underlying data supporting the results can be provided by sending a request to the third author or the corresponding author.
The authors declare that they have no conflicts of interest.
The authors are grateful to the Deanship of Scientific Research, King Saud University for funding through Vice Deanship of Scientific Research Chairs.