A Framework for Determining the Big Five Personality Traits Using Machine Learning Classification through Graphology

With the progress of computing, graphology has moved towards computerization. The fundamental problem in automated graphology is how to determine personality traits from digital handwriting using the principles of graphology. Although various models and approaches have been developed in research on automated graphology, obstacles remain, such as the selection of preprocessing techniques and image processing algorithms to extract handwriting features and of proper classification techniques to maximize accuracy. Therefore, this study aims to design a reliable framework that uses image processing techniques such as filtering, thresholding, and normalization, together with machine learning, to determine personality traits from handwriting features. The handwriting features are then classified according to the Big Five model. Experiments using the decision tree, SVM (RBF kernel), and KNN produced an accuracy above 99%. These results indicate that the proposed framework can be applied to predict Big Five personality traits from handwriting analysis features.


Introduction
Handwriting is a means of communication between humans and expresses the ideas that exist in the human brain. Generally, handwriting has a unique pattern, just like the pattern of human fingerprints. This is the fundamental reason why handwriting can be analysed to determine human behaviour and personality. Handwriting analysis can be used as a means of introspection to find out the strengths and weaknesses of a person. The discipline that studies human personality through handwriting is called handwriting analysis, better known as graphology. Graphology aims to identify and predict human personality by finding patterns in the handwriting that provide essential information about the writer's mental, physical, and emotional state and behaviour.
The development of graphology has moved towards computerization and has become a separate field of research. The fundamental problem in computerized graphology is how to determine human personality from digital handwriting using the principles of graphology. The first research on computerized graphology, called computer-aided graphology, used the principles of pattern recognition and consisted of three main stages, namely, preprocessing, feature extraction, and classification [1]. These stages have since become an integral part of any computerized graphology system. The field then developed rapidly into a separate research area for determining human personality through handwriting.
The Five Factor Model (FFM) of personality is a set of five broad personality trait dimensions, often referred to as the "Big Five Model," which consists of openness to experience, conscientiousness, extraversion, agreeableness, and neuroticism [2]. The Big Five model has been consistently applied to career guidance and job performance [3], the analysis of financial behaviour [4], employee recruitment [5], and marital relations [6]. The study in [7] found that the Big Five model performs better than other psychometric instruments such as the MBTI.
Manual classification of personality traits from handwriting by a graphologist is time-consuming and costly. Machine learning involves the use and development of computer systems that are able to learn and adapt without following explicit instructions by using algorithms and statistical models to analyse and draw inferences from patterns in data [8]. Several studies apply personality measurement techniques based on the mapping and combination of several handwriting features. Handwriting analysis features such as baseline, slope, pen pressure, connecting stroke, letter "t," letter "f," and spacing between lines are combined to determine human personality and behaviour based on the five-factor model [9]. In another study, the FFM was used to determine personality traits from features such as baseline, letter "t," line spacing, word spacing, and pen pressure, which were classified using the PersonaNet algorithm based on the CNN model [10]. Other measurement techniques such as the Myers-Briggs Type Indicator (MBTI) are also used to determine the personality traits of a person in combination with classification techniques such as ANN, SVM, template matching, and KNN [11,12]. In addition, the Enneagram model, another psychological measurement, has been combined with the C-means technique to group personalities into nine types, namely, the reformer, helper, achiever, individualist, investigator, loyalist, enthusiast, challenger, and peacemaker [13].
In this study, we present a model for classifying personality traits from handwriting according to the Big Five model using image processing and machine learning approaches. The model architecture starts with the preprocessing stage, which includes noise removal, thresholding, segmentation, and normalization. At the feature extraction stage, features such as baseline, top margin, line spacing, word spacing, letter size, slant, and pen pressure are extracted with an image processing approach using the OpenCV library [14]. The classification stage then groups the human personality psychologically based on the extracted handwriting features. This stage consists of three steps: the first step is to determine the decision rules for each class based on the features of handwriting analysis, the second step is to map the features for psychological identification by applying the Big Five personality psychology method, and the third step is to classify the Big Five personality from the handwriting images with a machine learning approach based on the psychological identification mapping.
The main contributions of this paper are as follows: (1) We propose a framework to determine Big Five personality traits from handwriting images using machine learning classification. (2) The experiments show that the proposed framework is very effective and outperforms state-of-the-art classification methods for determining the Big Five personality traits from handwriting images.
The organization of this paper is as follows: Section 2 provides the materials and methods, Section 3 reviews the related works, Section 4 describes the methodology, Section 5 gives the results, Section 6 presents the discussion, and Section 7 gives the conclusion.

Materials and Methods
In this study, we use the public IAM handwriting database [15]. It contains forms of English handwritten text that can be used to train and test handwritten text recognition and to perform writer identification and verification experiments. The database contains unlined handwritten text forms, which were scanned at a resolution of 300 dpi and saved as PNG images with 256 grey levels. The IAM handwriting database comprises contributions from 657 participants: 1539 handwritten text pages, 5685 labelled sentences, 13353 labelled text lines, and 115320 labelled words.

Related Works
Many researchers have published papers on handwriting analysis classification. Table 1 presents a brief overview of the authors' contributions to automated handwriting analysis. As Table 1 shows, existing studies still fall short of a complete framework for handwriting analysis, as indicated by the fact that the accuracy obtained in several of them is still below 90% [4,9,10,12,19,21]. Joshi et al. [18] developed a classification framework based on the support vector machine (SVM) that achieved 97% classification accuracy. The template-matching technique can be useful for extracting individual letters, but it needs a larger template database to give better results, and a larger template database naturally requires more training time [9,18]. Deep learning architectures show impressive results [20,22]. Pathak et al. [22] developed a deep neural network architecture model that obtained 97.7% accuracy. The disadvantages of this technique are that it requires more computational resources and is prone to overfitting [24]. In relation to the studies described above, this study aims to build a framework for predicting personality traits based on the Big Five personality model in terms of graphology using machine learning approaches. This research is expected to offer an alternative for assessing human personality through handwriting.
In the next section, we discuss the theoretical models of each part of the proposed framework.

Methodology
As mentioned in the previous section, our research aims to build a framework for predicting personality traits based on the Big Five personality model in terms of graphology using machine learning approaches. Figure 1 shows the framework of our proposed research. An explanation of each process is described in the following subsections.

Preparing the Dataset.
The system design begins by cropping the handwriting image from the IAM database [15]. The cropping process removes the parts of the image that are not needed for feature extraction. Each cropped image is stored in the PNG format with a width of 850 pixels and a height that adjusts to the handwritten text content. Figure 2 shows a handwriting image from the IAM database before and after the cropping process.

Preprocessing.
Some noise generated during the scanning process is still present in the handwriting image. This noise must be removed from the image to produce optimal feature extraction. This study uses bilateral filtering from the OpenCV library [25,26]. After filtering, the next step is to binarize the handwriting image using the thresholding technique in the OpenCV library [27]. The thresholding technique is selected based on the dominance of two colour intensities in the handwriting image. The third stage of preprocessing normalizes the handwriting image using dilation, contour, and affine transformation techniques, again from the OpenCV library [28]. This stage aims to separate the individual text lines and words, which are later used to determine the spacing between lines and between words.

Extraction of Handwriting Analysis Features.
After the preprocessing stage, the handwriting analysis features are extracted from the database of handwriting samples. Based on [29], the features used are baseline, top margin, line spacing, word spacing, letter size, slant, and pen pressure. All features are extracted using the OpenCV library.

Baseline.
The baseline of handwriting is an invisible line on which the bottoms of the middle-zone letters align [29]. The baseline is classified by its angle: a positive angle (baseline > 0°) is categorized as descending, and a negative angle (baseline < 0°) is categorized as ascending. Table 2 gives the details of the baseline feature and its characteristics.

Letter Size.
Letter size is determined from the middle zone of all the text lines. The average over all lines is taken as the letter size value. The size of the middle zone estimates the letter size without considering the upper and lower zones. To classify the letter size of a handwriting sample, the middle-zone portion of each line of the text is measured. The letter size in the normal category is about 1/8 inch (3.175 mm) [29]. A letter size of more than 1/8 inch is categorized as larger than normal, and a letter size of less than 1/8 inch is categorized as smaller than normal. Table 3 gives the details of the letter size feature and its characteristics.

Line Spacing.
The amount of space between the lines of the text is called line spacing [29]. Normal line spacing is around 2-3x the size of the letter (in the middle zone, excluding the upper and lower zones). Line spacing of less than 2x the letter size is categorized as narrow line spacing, while line spacing of more than 3x the letter size is categorized as wide line spacing. Table 4 gives the details of the line spacing feature and its characteristics.

Word Spacing.
The amount of space between the words of the text is called word spacing [29]. Normal word spacing is 1x the size of the letter (in the middle zone, excluding the upper and lower zones). Word spacing of less than 1x the letter size is categorized as narrow word spacing, while word spacing of more than 1x the letter size is categorized as wide word spacing. Table 5 gives the details of the word spacing feature and its characteristics.

Top Margin.
The top margin is classified in the same way as line spacing: the normal top margin is 2x the size of the letter (in the middle zone, excluding the upper and lower zones) [29]. A top margin of less than 2x the letter size is categorized as a narrow top margin, while a top margin of more than 2x the letter size is categorized as a wide top margin. Table 6 gives the details of the top margin feature and its characteristics.
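The threshold rules given above for baseline, letter size, line spacing, word spacing, and top margin can be sketched as small functions. The function names, the category labels, the "straight" label for a zero angle, the `tol_inch` tolerance, and the 300 dpi default are assumptions made for illustration; in particular, the cropped images may not preserve the 300 dpi scale of the original scans.

```python
NORMAL_LETTER_INCH = 1 / 8  # normal middle-zone height stated in the text


def classify_baseline(angle_deg):
    # Positive angle -> descending, negative angle -> ascending.
    if angle_deg > 0:
        return "descending"
    if angle_deg < 0:
        return "ascending"
    return "straight"  # zero-angle label is an assumption


def classify_letter_size(height_px, dpi=300, tol_inch=0.0):
    # Convert the middle-zone height from pixels to inches (dpi assumed).
    height_inch = height_px / dpi
    if height_inch > NORMAL_LETTER_INCH + tol_inch:
        return "large"
    if height_inch < NORMAL_LETTER_INCH - tol_inch:
        return "small"
    return "normal"


def classify_spacing(gap_px, letter_px, low, high):
    # Express the gap as a multiple of the middle-zone letter size.
    ratio = gap_px / letter_px
    if ratio < low:
        return "narrow"
    if ratio > high:
        return "wide"
    return "normal"


letter = 20  # hypothetical middle-zone letter height in pixels
print(classify_baseline(-3.5))
print(classify_letter_size(45))
print(classify_spacing(50, letter, low=2, high=3))  # line spacing: 2-3x
print(classify_spacing(25, letter, low=1, high=1))  # word spacing: 1x
print(classify_spacing(30, letter, low=2, high=2))  # top margin: 2x
```

Because word spacing and top margin have a single normal value rather than a range, a practical implementation would probably need a tolerance band around that value; the text does not specify one.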

Pen Pressure.
Pen pressure is extracted as the average value of all nonzero pixels (the intensity of the handwriting text pixels), that is, the sum of their intensities divided by the number of such pixels counted after the binarization process. A pixel intensity value above 180 is categorized as heavy, a pixel intensity below 140 is categorized as light, and the rest is normal. Table 7 gives the details of the pen pressure feature and its characteristics.
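A minimal sketch of this computation is shown below. It assumes the input is an intensity grid in which background pixels have already been set to 0 and ink pixels keep a nonzero intensity (for example, an inverted greyscale image); the function names are hypothetical.

```python
def pen_pressure(pixels):
    """Mean intensity of the nonzero (ink) pixels of a 2-D intensity grid."""
    ink = [v for row in pixels for v in row if v > 0]
    if not ink:
        return 0.0  # blank image: no ink pixels to average
    return sum(ink) / len(ink)


def classify_pressure(mean_intensity):
    # Thresholds 180 and 140 are the ones stated in the text.
    if mean_intensity > 180:
        return "heavy"
    if mean_intensity < 140:
        return "light"
    return "normal"


# Hypothetical 3 x 3 patch: four ink pixels, five background pixels.
sample = [[0, 200, 190],
          [0, 0, 210],
          [185, 0, 0]]
mean = pen_pressure(sample)  # (200 + 190 + 210 + 185) / 4 = 196.25
print(mean, classify_pressure(mean))
```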

Slant.
The slant of writing refers to the direction of the letter slope and is determined by the angle formed between the downstroke and the baseline [29]. To find the slant angle, the deslanting technique proposed by Luettin and Luettin [30] was used. The deslanting technique is based on the hypothesis that each "word" is deslanted when the number of columns containing a continuous stroke is maximum [30]. In this technique, a shear transformation is applied for each angle in a suitable range.
After the features have been extracted, the next step is to map the graphology features to the Big Five personality types. The correlation between these features is presented in Table 9.
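The deslanting search described in the Slant subsection can be sketched in pure Python on a small binary image. This is a sketch under stated assumptions: the score weights each continuous column by its squared height (the text only states the counting hypothesis), the angle range and step are arbitrary, and a positive angle is taken to mean a stroke whose top leans to the right.

```python
import math


def shear_columns(image, angle_deg):
    """Group ink pixels by the column they land in after a horizontal shear."""
    height = len(image)
    shift_per_row = math.tan(math.radians(angle_deg))
    columns = {}
    for r, row in enumerate(image):
        # Pixels further above the bottom row are shifted further.
        shift = round((height - 1 - r) * shift_per_row)
        for x, value in enumerate(row):
            if value:
                columns.setdefault(x - shift, []).append(r)
    return columns


def deslant_score(image, angle_deg):
    """Sum of squared heights over columns holding one gap-free stroke."""
    score = 0
    for rows in shear_columns(image, angle_deg).values():
        extent = max(rows) - min(rows) + 1
        if len(rows) == extent:  # no gaps, so the stroke is continuous
            score += len(rows) ** 2
    return score


def estimate_slant(image, angles=range(-45, 46, 5)):
    # The shear angle with the highest score is taken as the slant angle.
    return max(angles, key=lambda a: deslant_score(image, a))


# An 8 x 8 image with one stroke leaning 45 degrees to the right.
size = 8
image = [[1 if x == size - 1 - r else 0 for x in range(size)]
         for r in range(size)]
print(estimate_slant(image))
```

Counting continuous columns without a weight would favour the slanted stroke, because every one-pixel column is trivially continuous; the squared-height weighting is one way to avoid that.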

Personality Trait Classifcation.
After mapping, the next step is to classify the personality using several machine learning approaches. The five factors of the Big Five model are predicted from the mapping that has been performed. Therefore, there are 5 separate labels, one for each personality trait, and 5 classifications, one for each factor of the Big Five (FFM) model. The classification process uses 3 different machine learning algorithms: the SVM, KNN, and decision tree.
SVM is a supervised learning method based on building a hyperplane or a collection of hyperplanes in high- or infinite-dimensional spaces, which can be used for classification, regression, or other tasks [33,34]. A hyperplane is said to be optimal, or to generalize best, if it has the largest margin; in other words, the resulting error depends on the size of the margin. Four kernels are commonly used with SVM, namely, the linear kernel, polynomial kernel, radial basis function (RBF) kernel, and sigmoid kernel.
KNN is an instance-based learning classifier that works by finding the k patterns (among all the training patterns in all classes) closest to the input pattern and then assigning the class that holds the majority among those k patterns [35].
A decision tree (DT) is a nonparametric supervised learning method used for classification and regression with a tree structure [36,37]. The goal is to create a model that predicts the value of the target variable by learning simple decision rules deduced from the data features. A DT takes a set of input data to classify and outputs a tree resembling a flow diagram in which each leaf is a decision (a class) and each internal (nonfinal) node represents a test. During classification, only the features needed for the test pattern under consideration are evaluated, so feature selection is implicit. The most commonly used decision tree classifiers are binary and use a single feature at each node, resulting in decision boundaries that are parallel to the feature axes. As a result, such decision trees are intrinsically less than optimal for most applications. However, the main advantage of tree classifiers, apart from their speed, is the possibility to interpret the decision rules in terms of individual features. This makes decision trees attractive for interactive use by researchers.
To implement some of the machine learning approaches above, the Scikit-Learn Library module in Python is used [38]; then, performance testing is carried out on each personality in the Big Five model.
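As an illustration of how the three classifier families can be set up in scikit-learn, the sketch below trains each of them for a single trait. The data are synthetic stand-ins generated with `make_classification` (seven numeric features, one per handwriting feature, and a binary label), and all hyperparameters are library defaults or arbitrary choices rather than the settings used in this study.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the extracted features of one Big Five trait.
X, y = make_classification(n_samples=500, n_features=7, n_informative=5,
                           n_redundant=0, random_state=0)
# Hold out 20% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

models = {
    "SVM (linear)": SVC(kernel="linear"),
    "SVM (RBF)": SVC(kernel="rbf"),
    "SVM (polynomial)": SVC(kernel="poly"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Decision tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.3f}")
```

Repeating the loop once per trait, each with its own label column, would give the five separate classifications described above.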

Experiment Results
This experiment used all the handwriting images from the IAM handwriting database, a total of 1539 images. Performance measurement was carried out using the Python programming language [39], the OpenCV library, and the Scikit-Learn library. The test was run on a PC with the following specifications: a 9th-generation i7 processor, an NVIDIA GeForce GTX 1660 Ti GPU, and 16 GB of DDR4 memory. The handwriting features extracted from all images are stored in one file, which becomes the labelled data file for the handwriting image documents. There are two labels for each model, identified and not identified. Performance measurement was performed with machine learning algorithms. Five classification scenarios were carried out: the SVM (three variations of the kernel: linear, RBF, and polynomial), KNN, and decision tree, with a split ratio of 20:80 for testing and training data. Performance testing was performed on each dimension of the Big Five model, namely, neuroticism, openness to experience, extraversion, agreeableness, and conscientiousness.

Performance Measures.
The classification performance measures used for the comparison are accuracy, precision, recall, and F1 score, which are computed from the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) using the equations shown in Table 10. Table 11 presents the data from the classification process for the neuroticism model. The parameters used in the classification report are accuracy, precision, recall, and F1 score. From these data, it can be seen that the SVM classifier with the RBF kernel, KNN, and the decision tree produce maximum performance for the model. Table 12 shows the data from the classification process for the openness to experience model. The same parameters are used in this classification report, with maximum accuracy obtained using the SVM (RBF kernel), KNN, and decision tree. The difference from the neuroticism model is that SVM with a linear kernel also reaches an accuracy above 90%. Table 13 shows the data from the classification process for the extraversion model. From these data, it can be seen that SVM with a linear kernel does not reach maximum results, with accuracy below 90%. Table 14 shows the data from the classification process for the agreeableness model. As in the previous model, SVM with a linear kernel does not reach maximum results, with accuracy below 90%. Table 15 shows the data from the classification process for the conscientiousness model. From these data, SVM with an RBF kernel and the decision tree achieved the highest accuracy with 100%, KNN and SVM with a polynomial kernel obtained 99%, and SVM with a linear kernel achieved the lowest accuracy with 88%. Figure 3 shows the confusion matrix for each model of the Big Five. It can be seen that the amount of data used for testing is 308, or 20 percent of the 1539 handwriting images.
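The four measures follow the standard definitions in terms of the confusion-matrix counts, which can be sketched as below. The counts in the example are hypothetical and were chosen only so that they sum to the 308 test images; they are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 score from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Size of a 20% test split of the 1539 images.
test_size = round(0.2 * 1539)

# Hypothetical confusion counts for one trait.
metrics = classification_metrics(tp=150, tn=150, fp=5, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```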

K-Fold Cross-Validation.
Evaluating machine learning models can be difficult. Typically, we divide the data set into training and test sets and then use the training set to train the model and the test set to test the model. This method can be unreliable because the accuracy obtained for one test set can be very different from the accuracy obtained for a different test set. K-fold cross-validation (CV) addresses this problem by dividing the data set into K parts (folds) and using each fold as the test set once while the remaining folds are used for training [40]. The accuracy results are validated here using k = 10 cross-validation (Figure 4). The performance of the classifier model is assessed with two performance metrics: the mean absolute error (MAE) and the root mean square error (RMSE). Table 16 shows the classifier output for each model of the Big Five using 10-fold cross-validation. In the neuroticism model, the decision tree has the lowest MAE score with a value of 0; the values for the other classifiers are given in Table 16. For the accuracy with the cross-validation-tuning method shown in Figure 5, the decision tree has an average CV score of 100% accuracy, SVM RBF has an average CV score of 99.935%, the KNN has an average CV score of 98.96%, the SVM polynomial has an average CV score of 89.67%, and SVM linear has an average CV score of 84.149%. From the data obtained, all classifiers decreased in accuracy under the 10-fold CV score, except for the decision tree, which is relatively stable. The most significant decrease in accuracy is in SVM with a polynomial kernel, from an accuracy of 94% to an accuracy of 89%.
In the openness to experience model, the decision tree has the lowest MAE score with a value of 0.00064, followed by the SVM RBF kernel with a value of 0.01756, the KNN with a value of 0.02338, the SVM polynomial kernel with a value of 0.05459, and the SVM linear kernel with a value of 0.08705. For the accuracy with the cross-validation-tuning method shown in Figure 6, the decision tree has an average CV score of 99.93% accuracy, SVM RBF has an average CV score of 98.24%, the KNN has an average CV score of 96.48%, the SVM polynomial has an average CV score of 94.54%, and SVM linear has an average CV score of 91.29%. From the data obtained, the decision tree and SVM RBF classifiers decreased in accuracy under the 10-fold CV score, but the decrease is not significant, as the relatively small MAE values show. The most significant decrease in accuracy is in the KNN, from an accuracy of 100% to 97.76%.
In the extraversion model, the decision tree has the lowest MAE score with a value of 0, followed by the SVM RBF kernel with a value of 0.01756, the KNN with a value of 0.03511, the SVM polynomial kernel with a value of 0.06502, and the SVM linear kernel with a value of 0.13455. For the accuracy with the cross-validation-tuning method shown in Figure 7, the decision tree has an average CV score of 100% accuracy, SVM RBF has an average CV score of 98.50%, the KNN has an average CV score of 96.48%, the SVM polynomial has an average CV score of 93.49%, and SVM linear has an average CV score of 86.54%. From the data obtained, SVM RBF, SVM linear, and SVM polynomial decreased in accuracy under the 10-fold CV score, but the decrease is not significant, as the relatively small MAE values show. The decision tree has a stable value for the 10-fold CV score. The most significant decrease in accuracy is in the KNN, from an accuracy of 100% to 96.48%.
In the agreeableness model, the decision tree has the lowest MAE score with a value of 0, followed by the SVM RBF kernel with a value of 0.00454 and the KNN with a value of 0.04677; the values for the SVM polynomial and linear kernels are given in Table 16. For the accuracy with the cross-validation-tuning method shown in Figure 8, the decision tree has an average CV score of 100% accuracy, SVM RBF has an average CV score of 99.54%, the KNN has an average CV score of 95.32%, the SVM polynomial has an average CV score of 98.18%, and SVM linear has an average CV score of 84.73%. From the data obtained, the most significant decrease in accuracy is in the KNN, from an accuracy of 100% to 95.32%.
In the conscientiousness model, the decision tree has the lowest MAE score with a value of 0.00129, followed by the SVM RBF kernel with a value of 0.00194, the KNN with a value of 0.02857, the SVM polynomial kernel with a value of 0.02143, and the SVM linear kernel with a value of 0.09163. For the accuracy with the cross-validation-tuning method shown in Figure 9, the decision tree has an average CV score of 99.87% accuracy, SVM RBF has an average CV score of 99.80%, the KNN has an average CV score of 97.14%, the SVM polynomial has an average CV score of 97.85%, and SVM linear has an average CV score of 90.83%. From the data obtained, the most significant decrease in accuracy is in the KNN, from an accuracy of 99% to 97.14%.
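The fold construction and the two error metrics used in this subsection can be sketched in pure Python. This is an illustrative sketch, not the code used for the reported results: the folds are built from consecutive indices without shuffling, and the label vectors in the example are hypothetical.

```python
import math


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def mae(y_true, y_pred):
    # Mean absolute error.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    # Root mean square error.
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# 1539 = 10 * 153 + 9, so nine folds hold 154 images and one holds 153.
folds = list(k_fold_indices(1539, k=10))
sizes = [len(test) for _, test in folds]
print(sizes)

# Hypothetical 0/1 labels with one misclassification out of ten.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]
print(mae(y_true, y_pred), rmse(y_true, y_pred))
```

For binary 0/1 labels every error has magnitude 1, so the MAE equals the misclassification rate and the RMSE is its square root.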

Discussion
From all the data presented, it can be said that SVM with an RBF kernel and the decision tree classifier show very promising results. This is indicated by the accuracy across the five models, which reaches above 99%. The selection of an appropriate image processing algorithm that adapts to the characteristics of the handwriting dataset is very important. It is equally essential to select the right parameters in the classification process in order to produce good accuracy. Several previous studies also obtained strong results using SVM as a classifier, such as the study by Joshi et al. [18], which produced an accuracy of 97%. This reflects one of the advantages of SVM, which is very good at separating two different classes. In addition, the selection of the right kernel affects the results of the classification process. KNN and decision trees also show promising results. Among other studies, Gavrilescu [12] used the KNN as a classifier with an accuracy of 88.6%, and Topaloglu and Ekmekci [17], using decision trees, produced an accuracy of 93.75%. With deep neural network architectures, Pathak et al. [22] achieved 97.7% accuracy and Bernardo et al. [23] achieved 91.26%. The results are summarized in Table 17.
Although our model has performed well on the IAM data set, it is important to examine the results of our model on another handwriting image dataset, such as the CVL database [41]. We believe that our model is applicable to the identification of different handwriting images, and this will be one of our future research directions.

Conclusions
We presented a framework for determining the Big Five personality traits from handwriting analysis features and classified them using machine learning algorithms. Automated handwriting analysis helps the graphologist determine human personality traits more easily. The framework has three main stages: preprocessing, handwriting feature extraction, and personality classification based on a mapping to the Big Five model. The classification can be performed using different machine learning algorithms and was applied to a handwriting image database. The framework was further evaluated through 10-fold cross-validation to see the impact on accuracy, and other performance metrics such as the mean absolute error and root mean square error were discussed. All the metrics show good results, which means that the decision tree and SVM with an RBF kernel are suitable classifier techniques. Overall, the classification accuracy of the framework is higher than that of previous work.
The authors acknowledge the current limitations of this research. For example, our model has not yet been developed for real systems. Also, due to the limitations of the handwriting database, our model does not account for handwriting samples with different background colours. In future research, a novel framework will be designed with different psychological measurements such as the MBTI and Enneagram model. The authors will also address more complex handwriting databases and apply the model to a real system.

Table 17
Author                        Method                                 Accuracy (%)
Gavrilescu [12]               ANN, SVM, and KNN                      88.6
Topaloglu and Ekmekci [17]    Decision tree                          93.75
Joshi et al. [18]             SVM and template matching              97
Pathak et al. [22]            Deep neural network architecture       97.7
Bernardo et al. [23]          Hybrid two-stage SqueezeNet and SVM    91.26
Current work (2022)           SVM, KNN, and decision tree            99

Data Availability
The dataset is available in a public repository, Computer Vision and Artificial Intelligence, and can be accessed at the URL: https://fki.tic.heia-fr.ch/databases/iam-handwriting-database.

Conflicts of Interest
The authors declare that they have no conflicts of interest.