The 3D virtual world “Second Life” imitates a form of real life by providing a space for rich interactions and social events. Second Life encourages people to establish or strengthen interpersonal relations, to share ideas, to gain new experiences, and to feel genuine emotions accompanying all adventures of virtual reality. Undoubtedly, emotions play a powerful role in communication. However, to trigger the visual display of a user's affective state in a virtual world, the user has to manually assign an appropriate facial expression or gesture to his or her avatar. Affect sensing from text, which enables the automatic expression of emotions in the virtual environment, avoids this manual control and enriches remote communication effortlessly. In this paper, we describe a lexical rule-based approach to the recognition of emotions from text and an application of the developed Affect Analysis Model in Second Life. Based on the result of the Affect Analysis Model, the developed EmoHeart (an “object” in Second Life) triggers animations of avatar facial expressions and visualizes the recognized emotion with heart-shaped textures.
Sally Planalp [
Emotions play the role of a sensitive catalyst that fosters lively interactions between human beings and assists in the development and regulation of interpersonal relationships. The expression of emotions shapes social interactions by providing observers with a rich channel of information about the conversation partner [
The richness of emotional communication greatly benefits from the expressiveness of verbal (spoken words, prosody) and nonverbal (gaze, face, gestures, body pose) cues that enable auditory and visual channels of communication [
Media for remote online communication and emerging 3D virtual worlds, which provide new opportunities for social contact, are growing rapidly, engaging people, and gaining great popularity. The main motivations for “residents” of chat rooms or virtual environments to connect to these media are seeking conversation, experimenting with a new communication medium, and initiating relationships with other people. A study conducted by Peris et al. [
To establish a social and friendly atmosphere, people should be able to express emotions. However, media for online communication lack the physical contact and the visualization of the emotional reactions of partners involved in a remote text-mediated conversation, thus limiting the sources of information to text messages and to graphical representations of users (avatars) that are to some degree controlled by a person. Trends show that people often try to enrich their online interaction by introducing affective symbolic conventions or emphases into text (emoticons, capital letters, etc.) [
In this work we address the task of enhancing emotional communication in Second Life. This virtual world imitates a form of real life by providing a space for rich interactions and social events. To trigger the visual display of a user’s affective state in Second Life, the user has to manually assign an appropriate facial expression or gesture to his or her avatar, which can distract the user from the communication process. In order to achieve truly natural communication in virtual worlds, we set a twofold focus in our research:
The remainder of the paper is structured as follows. In Section
The emergence of the field of affective computing [
Physiological biosignals (such as facial electromyograms, the electrocardiogram, the respiration effort, and the electrodermal activity) were analysed by Rigas et al. [
The most challenging tasks for computational linguists are classifying text as subjective or factual, determining the orientation and strength of sentiment, and recognizing the type of attitude expressed in text at various grammatical levels. A variety of approaches have been proposed to determine the polarity of distinct terms [
The ideal method to accurately sense the emotional state of a person communicating remotely would be to integrate approaches aimed at detecting the affective state communicated through different expressive modalities and to obtain a decision based on the weights assigned to these expressive means. Our research is concerned with the recognition of emotions reflected in linguistic utterances. In this paper we describe the application of the emotion recognition algorithm in the 3D virtual world Second Life.
In this section, we will summarize the main steps of emotion recognition using our Affect Analysis Model, which was introduced in [
As the purpose of affect recognition in a remote communication system is to relate text to avatar emotional expressions, affect categories should be confined to those that can be visually expressed and easily understood by users. We analysed emotion categorizations proposed by theorists, and as a result of our investigation, for affective text classification, we decided to use the subset of emotional states defined by Izard [
In order to support the handling of abbreviated language and the interpretation of affective features of lexical items, the Affect database was created. The Affect database includes the following tables: Emoticons, Abbreviations, Adjectives, Adverbs, Nouns, Verbs, Interjections, and Modifiers. The affective lexicon was mainly taken from WordNet-Affect [
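For illustration, the following minimal sketch shows how a lexicon table of this kind could be stored and queried; the schema, words, and intensity values are invented for the example and do not reproduce the actual contents of the Affect database:

```python
import sqlite3

# Illustrative schema: one table per part of speech, each entry mapping
# a lexical item to an emotion label and an intensity in [0.0, 1.0].
# The rows below are made-up examples, not real database entries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE adjectives (word TEXT, emotion TEXT, intensity REAL)")
conn.executemany(
    "INSERT INTO adjectives VALUES (?, ?, ?)",
    [("glad", "joy", 0.8), ("gloomy", "sadness", 0.6)],
)

def lookup(word):
    """Return all (emotion, intensity) annotations of a word."""
    return conn.execute(
        "SELECT emotion, intensity FROM adjectives WHERE word = ?", (word,)
    ).fetchall()

print(lookup("glad"))  # [('joy', 0.8)]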
While constructing our lexical rule-based approach to affect recognition from text, we took into account linguistic features of text written in a free informal style [
In the first stage of the Affect Analysis Model, we test the sentence for emoticons, abbreviations, interjections, “?” and “!” marks, repeated punctuation, and capital letters. Several rules are applied to define the dominant emotion in cases when multiple emoticons and emotion-relevant abbreviations occur in a sentence. As interjections are added to sentences to convey emotion (e.g., “
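A simplified sketch of this first stage is given below; the emoticon and abbreviation lists are small placeholders standing in for the corresponding Affect database tables:

```python
import re

# Placeholder lexicons; the real Affect database tables are far richer.
EMOTICONS = {":)": "joy", ":(": "sadness", ":D": "joy"}
ABBREVIATIONS = {"lol": "joy", "omg": "surprise"}

def stage1_cues(sentence):
    """Collect surface emotional cues: emoticons, abbreviations,
    repeated '!'/'?' punctuation, and words in capital letters."""
    cues = []
    for emoticon, label in EMOTICONS.items():
        if emoticon in sentence:
            cues.append(("emoticon", emoticon, label))
    for token in re.findall(r"[a-z]+", sentence.lower()):
        if token in ABBREVIATIONS:
            cues.append(("abbreviation", token, ABBREVIATIONS[token]))
    if re.search(r"[!?]{2,}", sentence):
        cues.append(("repeated_punctuation", None, None))
    if re.search(r"\b[A-Z]{2,}\b", sentence):
        cues.append(("capital_letters", None, None))
    return cues

print(stage1_cues("OMG that was great!!! :)"))
```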
The second stage is devoted to the analysis of the syntactical structure of sentences, and it is divided into two main subtasks. First, sentence analysis based on the GNU GPL licensed Stanford Parser (
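For illustration, the sketch below groups hand-written dependency triples, standing in for Stanford Parser output, into the Subject, Verb, and Object formations used in the later stages; the sentence and relation names are invented for the example:

```python
# Hand-written dependency triples (relation, head, dependent) that
# stand in for parser output for "The girl adores cute puppies."
deps = [
    ("nsubj", "adores", "girl"),
    ("det", "girl", "the"),
    ("dobj", "adores", "puppies"),
    ("amod", "puppies", "cute"),
]

def dependents(head, relations):
    """Words attached to a head by one of the given relations."""
    return [d for rel, h, d in deps if h == head and rel in relations]

verb = "adores"
subject = dependents(verb, {"nsubj"})[0]
obj = dependents(verb, {"dobj"})[0]

# Group each head with its modifiers into one formation.
subject_formation = dependents(subject, {"det", "amod"}) + [subject]
object_formation = dependents(obj, {"det", "amod"}) + [obj]
print(subject_formation, [verb], object_formation)
# ['the', 'girl'] ['adores'] ['cute', 'puppies']
```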
In the third stage, the affective features of each word found in our database are represented as a vector of emotional state intensities e
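The nine categories in the sketch below are those used throughout this paper; the word entries and their intensities are invented for illustration:

```python
# The nine emotion categories recognized by the Affect Analysis Model.
CATEGORIES = ("anger", "disgust", "fear", "guilt", "interest",
              "joy", "sadness", "shame", "surprise")

def emotion_vector(**intensities):
    """Nine-dimensional vector of intensities in [0.0, 1.0];
    categories that are not named default to 0.0."""
    return [float(intensities.get(c, 0.0)) for c in CATEGORIES]

# Hypothetical database entries.
e_glad = emotion_vector(joy=0.8)
e_gloomy = emotion_vector(sadness=0.6)
print(e_glad)  # [0.0, 0.0, 0.0, 0.0, 0.0, 0.8, 0.0, 0.0, 0.0]
```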
The purpose of this stage is to detect the emotions involved in phrases, and then in Subject, Verb, or Object formations. We have defined the following rules for processing phrases:

adjective phrase: modify the vector of the adjective (e.g., e(“…”)),

noun phrase: output the vector with the maximum intensity within each corresponding emotional state among the analysed vectors (e.g., …),

verb plus adverbial phrase: output the vector with the maximum intensity within each corresponding emotional state among the analysed vectors (e.g., e(“…”)),

verb plus noun phrase: if the verb and the noun phrase have opposite valences (e.g., “…”), …,

verb plus adjective phrase (e.g., “…”).
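Two of these phrase-level rules lend themselves to a compact sketch: the per-category maximum used for noun phrases and verb-plus-adverbial phrases, and the scaling of an adjective's vector by a modifier. The coefficient value and word vectors below are assumptions made for illustration:

```python
def max_combine(v1, v2):
    """Noun-phrase / verb-plus-adverbial rule: per-category maximum."""
    return [max(a, b) for a, b in zip(v1, v2)]

def scale(coeff, v):
    """Adjective-phrase rule sketch: a modifier coefficient scales the
    adjective's vector; the value 1.5 is an assumption, capped at 1.0."""
    return [min(1.0, coeff * x) for x in v]

# Nine-element vectors ordered as: anger, disgust, fear, guilt,
# interest, joy, sadness, shame, surprise (illustrative values).
e_friend = [0, 0, 0, 0, 0.3, 0.6, 0, 0, 0]    # hypothetical "friend"
e_old    = [0, 0, 0, 0, 0.0, 0.0, 0.2, 0, 0]  # hypothetical "old"
print(max_combine(e_old, e_friend))  # noun phrase "old friend"
print(scale(1.5, e_old))             # e.g., intensified by a modifier
```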
The rules for modifiers are as follows:
Each of the Subject, Verb, or Object formations may contain words conveying emotional meaning. During this stage, we apply the described rules to phrases detected within formation boundaries. Finally, each formation can be represented as a unified vector encoding its emotional content.
The emotional vector of a simple sentence (or a clause) is generated from Subject, Verb, and Object formation vectors resulting from phrase-level analysis. The main idea here is to first derive the emotion vector of Verb-Object formation relation. It is estimated based on the “verb plus noun phrase” rule described above. In order to apply this rule, we automatically determine valences of Verb and Object formations using their unified emotion vectors (particularly, nonzero-intensity emotion categories). The estimation of the emotion vector of a clause (Subject plus Verb-Object formations) is then performed in the following manner: (1) if valences of Subject formation and Verb formation are opposite (e.g., SF
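A partial sketch of this step is shown below. The valence of a formation is derived here from the nonzero-intensity categories of its vector, under an assumed split of the nine categories into positive and negative; the full set of clause-level decision rules follows the description above:

```python
CATEGORIES = ("anger", "disgust", "fear", "guilt", "interest",
              "joy", "sadness", "shame", "surprise")
# Assumed valence split for illustration; the paper derives valence
# from the nonzero-intensity categories of a formation's vector.
POSITIVE = {"interest", "joy"}
NEGATIVE = {"anger", "disgust", "fear", "guilt", "sadness", "shame"}

def valence(v):
    """Dominant valence of a formation vector (simplified)."""
    pos = sum(x for c, x in zip(CATEGORIES, v) if c in POSITIVE)
    neg = sum(x for c, x in zip(CATEGORIES, v) if c in NEGATIVE)
    if pos == neg == 0:
        return "neutral"
    return "positive" if pos >= neg else "negative"

verb_obj = [0, 0, 0, 0, 0, 0.7, 0, 0, 0]  # hypothetical Verb-Object vector
subject  = [0, 0, 0, 0, 0, 0, 0.4, 0, 0]  # hypothetical Subject vector
# Opposite valences here would trigger the clause-level decision rules.
print(valence(subject), valence(verb_obj))  # negative positive
```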
To estimate the emotional vector of a compound sentence, first, we evaluate the emotional vectors of its independent clauses. Then, we define the resulting vector of the compound sentence based on two rules:
In order to process a complex sentence with a complement clause (e.g., “
While processing complex-compound sentences (e.g., “
In order to evaluate the performance of the Affect Analysis Model and to compare our method with related work, we conducted a set of experiments on data sets extracted from blogs.
To measure the accuracy of the proposed emotion recognition algorithm with the freely available Stanford parser [
Three independent annotators labelled the sentences with one of nine emotion categories (or neutral) and a corresponding intensity value. For the evaluation of algorithm performance, we created two collections of sentences corresponding to different “gold standards”:
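The construction of these two collections can be sketched as a simple majority filter over the three annotations per sentence; the annotations below are hypothetical:

```python
from collections import Counter

# Hypothetical annotations: three emotion labels per sentence.
annotations = {
    "s1": ["joy", "joy", "neutral"],
    "s2": ["fear", "fear", "fear"],
    "s3": ["anger", "sadness", "joy"],
}

def gold_standard(min_agreement):
    """Keep sentences whose majority label reaches the threshold."""
    gold = {}
    for sentence, labels in annotations.items():
        label, count = Counter(labels).most_common(1)[0]
        if count >= min_agreement:
            gold[sentence] = label
    return gold

print(gold_standard(2))  # "at least two annotators agreed" set
print(gold_standard(3))  # "all three annotators agreed" set
```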
The distributions of emotion labels across “gold standard” sentences.
Each cell gives the number of sentences / percentage.

| Gold standard | neutral | anger | disgust | fear | guilt | interest | joy | sadness | shame | surprise | total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2-3 annotators agreed | 75/11.4 | 59/9.0 | 30/4.6 | 49/7.5 | 22/3.4 | 43/6.6 | 181/27.6 | 145/22.1 | 9/1.4 | 43/6.6 | 656/100 |
| 3 annotators agreed | 8/3.2 | 17/6.8 | 9/3.6 | 24/9.6 | 12/4.8 | 8/3.2 | 88/35.3 | 58/23.3 | 3/1.2 | 22/8.8 | 249/100 |
The performance of the Affect Analysis Model (AAM) employing the Stanford Parser was evaluated against both sets of sentences related to the “gold standards.” Averaged accuracy, precision, recall, and F-score are shown in Table
Results of the experiment with the Affect Analysis Model employing the Stanford Parser.
Columns “neut” through “sur” denote the fine-grained emotion categories; “Pos”, “Neg”, and “Neut” denote the merged labels. On the “2-3 annotators agreed” set, the averaged accuracy of AAM was 0.649 for fine-grained categories and 0.747 for merged labels; on the “3 annotators agreed” set, it was 0.751 and 0.814, respectively (see also the parser comparison table below).

| Gold standard | Measure | neut | ang | disg | fear | guilt | inter | joy | sad | sh | sur | Pos | Neg | Neut |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2-3 annotators agreed | Precision | 0.30 | 0.77 | 0.64 | 0.74 | 0.71 | 0.61 | 0.83 | 0.74 | 0.50 | 0.76 | 0.84 | 0.91 | 0.28 |
| | Recall | 0.55 | 0.34 | 0.70 | 0.80 | 0.55 | 0.81 | 0.71 | 0.64 | 0.67 | 0.72 | 0.80 | 0.75 | 0.55 |
| | F-score | 0.39 | 0.47 | 0.67 | 0.76 | 0.62 | 0.70 | 0.76 | 0.69 | 0.57 | 0.74 | 0.82 | 0.82 | 0.37 |
| 3 annotators agreed | Precision | 0.15 | 0.92 | 0.83 | 0.87 | 0.80 | 0.50 | 0.96 | 0.88 | 0.50 | 0.82 | 0.94 | 0.98 | 0.08 |
| | Recall | 0.75 | 0.65 | 0.56 | 0.83 | 0.67 | 0.75 | 0.78 | 0.74 | 0.33 | 0.82 | 0.85 | 0.79 | 0.75 |
| | F-score | 0.24 | 0.76 | 0.67 | 0.85 | 0.73 | 0.60 | 0.86 | 0.80 | 0.40 | 0.82 | 0.89 | 0.88 | 0.14 |
We also evaluated the system performance with regard to the estimation of emotion intensity. The percentage of emotional sentences (neutral ones excluded) on which the result of our system conformed to the “gold standards”, grouped by the distance between the intensities given by human raters (averaged values) and those obtained by the Affect Analysis Model, is shown in Table
Percentage of emotional sentences according to the range of intensity difference between human annotations and output of algorithm.
Values are percentages of emotional sentences falling into each range of intensity difference.

| Gold standard | [0.0–0.2] | (0.2–0.4] | (0.4–0.6] | (0.6–0.8] | (0.8–1.0] |
|---|---|---|---|---|---|
| 2-3 annotators agreed | 48.8 | 30.6 | 16.6 | 3.9 | 0.0 |
| 3 annotators agreed | 51.4 | 27.6 | 17.1 | 3.9 | 0.0 |
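The figures in the table above can be reproduced by binning the absolute difference between the averaged human intensity and the system intensity for each emotional sentence; the intensity pairs in this sketch are invented:

```python
# Hypothetical (human_average, system) intensity pairs; the real values
# come from the annotations and the Affect Analysis Model output.
pairs = [(0.8, 0.73), (0.5, 0.32), (0.9, 0.4), (0.6, 0.55)]

BINS = [(0.0, 0.2), (0.2, 0.4), (0.4, 0.6), (0.6, 0.8), (0.8, 1.0)]

def bin_percentages(pairs):
    counts = [0] * len(BINS)
    for human, system in pairs:
        diff = abs(human - system)
        for i, (lo, hi) in enumerate(BINS):
            # The first bin is closed on both ends; the rest are (lo, hi].
            if (lo < diff <= hi) or (i == 0 and diff <= hi):
                counts[i] += 1
                break
    return [100.0 * c / len(pairs) for c in counts]

print(bin_percentages(pairs))  # [75.0, 0.0, 25.0, 0.0, 0.0]
```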
Examples of sentences and their annotations.
| Sentence | Annotator 1 / annotator 2 / annotator 3 | Result of AAM |
|---|---|---|
| … | anger:0.6 / anger:1.0 / neutral:0.0 | anger:0.51 |
| … | disgust:0.6 / disgust:0.7 / neutral:0.0 | disgust:0.32 |
| … | fear:0.8 / fear:0.5 / fear:0.9 | fear:0.32 |
| … | guilt:0.7 / guilt:0.9 / guilt:1.0 | guilt:0.77 |
| … | interest:0.7 / anger:1.0 / interest:0.8 | interest:0.96 |
| … | joy:0.4 / neutral:0.0 / joy:0.8 | joy:0.48 |
| … | sadness:0.2 / sadness:0.2 / neutral:0.0 | sadness:0.32 |
| … | guilt:0.7 / shame:0.7 / shame:1.0 | shame:0.38 |
| … | surprise:0.8 / surprise:1.0 / surprise:1.0 | surprise:0.4 |
| … | neutral:0.0 / neutral:0.0 / anger:0.5 | neutral:0.0 |
The analysis of the failures of the Affect Analysis Model revealed that common sense or additional context is required for processing some sentences. For example, human annotators agreed on the “sadness” emotion conveyed through “
It is worth noting, however, that the accuracy of the Affect Analysis Model with the (commercially available) parser used in our previous work (Connexor Machinese Syntax) was higher by 6–8 percentage points on the same sets of sentences (see details of the comparison in Table
Comparison of accuracy of Affect Analysis Model employing different parsers (Connexor Machinese Syntax versus Stanford Parser).
| Measure | 2-3 annotators agreed, fine-grained emotions (656 sentences; Kappa …) | 2-3 annotators agreed, merged labels (692 sentences; Kappa …) | 3 annotators agreed, fine-grained emotions (249 sentences; Kappa …) | 3 annotators agreed, merged labels (447 sentences; Kappa …) |
|---|---|---|---|---|
| Accuracy of AAM with Connexor Machinese Syntax | 0.726 | 0.816 | 0.815 | 0.890 |
| Accuracy of AAM with Stanford Parser | 0.649 | 0.747 | 0.751 | 0.814 |
| Difference (percentage points) | 7.7 | 6.9 | 6.4 | 7.6 |
This emotion blog data set was developed and kindly provided by Aman and Szpakowicz [
Distribution of labels across sentences from benchmark used in the experiment.
Labels | Number of sentences |
---|---|
joy | 536 |
sadness | 173 |
anger | 179 |
disgust | 172 |
surprise | 115 |
fear | 115 |
neutral | 600 |
AAM is capable of recognizing nine emotions, whereas the methods described in [
Results of AAM compared to machine learning methods proposed by Aman and Szpakowicz [
The averaged accuracy of AAM on this data set was 0.770.

| Algorithm | Measure | joy | sadness | anger | disgust | surprise | fear | neutral |
|---|---|---|---|---|---|---|---|---|
| AAM | Precision | 0.758 | 0.785 | | | | | |
| | Recall | | | | | | | |
| | F-score | | | | | | | |
| ML with unigrams | Precision | 0.840 | 0.619 | 0.634 | 0.772 | | | 0.581 |
| | Recall | 0.675 | 0.301 | 0.358 | 0.453 | 0.339 | 0.487 | 0.342 |
| | F-score | 0.740 | 0.405 | 0.457 | 0.571 | 0.479 | 0.629 | 0.431 |
| ML with unigrams, RT features, and WNA features | Precision | 0.813 | 0.605 | 0.650 | 0.672 | 0.723 | 0.868 | 0.587 |
| | Recall | 0.698 | 0.416 | 0.436 | 0.488 | 0.409 | 0.513 | 0.625 |
| | F-score | 0.751 | 0.493 | 0.522 | 0.566 | 0.522 | 0.645 | 0.605 |
The obtained results (precision, recall, and F-score) revealed that our rule-based system outperformed both machine learning methods in the automatic recognition of “joy”, “sadness”, “anger”, “disgust”, and “neutral”. In the case of the “surprise” and “fear” emotions, “ML with unigrams” achieved higher precision, but lower recall and F-score, than our AAM.
Emotional expression is natural and very important for communication in real life, but it is currently rather cumbersome in the 3D virtual world Second Life, where expressions have to be selected and activated manually. Concretely, a user has to click on an animation gesture in the list or type a predefined command following the symbol “/” in the textual chat entry. In order to breathe emotional life into the graphical representations of users (avatars) by automating emotional expressiveness, we applied the developed Affect Analysis Model to textual chat in Second Life. The architecture of the system is presented in Figure
Architecture of the EmoHeart system.
The control of the conversation is implemented through the Second Life object called EmoHeart (
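On the server side, this interaction reduces to receiving the chat text and returning the recognized emotion and its intensity. The minimal sketch below uses Python's standard HTTP server; the port, plain-text request body, and JSON response fields are assumptions, and the analyse() stub stands in for the actual Affect Analysis Model:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def analyse(text):
    """Stand-in for the Affect Analysis Model; the returned label and
    intensity are placeholders for the real analysis result."""
    return {"emotion": "joy", "intensity": 0.8}

class EmoHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the chat message sent by the in-world EmoHeart object.
        length = int(self.headers["Content-Length"])
        text = self.rfile.read(length).decode("utf-8")
        body = json.dumps(analyse(text)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        # EmoHeart maps this reply to a facial-expression animation
        # and a heart texture of matching type and size.
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), EmoHandler).serve_forever()
```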
Of the bodily organs, the heart plays a particularly important role in our emotional experience. People often characterize personal traits, emotional experiences, or mental states using expressions originating from the word “
Examples of avatar facial expressions and EmoHeart textures.
While designing the EmoHeart textures, we followed the description of the main characteristic features of expressive means in relation to the communicated emotion (Table
Emotional states and relevant expressive means (data partially taken from [
Emotion | Expressive means |
---|---|
Anger | widely open eyes, fixated; pupils contracted; stare gaze; ajar mouth; teeth usually clenched tightly; rigidity of lips and jaw; lips may be tightly compressed, or may be drawn back to expose teeth |
Disgust | narrowed eyes, may be partially closed as a result of the nose being drawn upward; upper lip drawn up; pressed lips; wrinkled nose; head turned to the side as if avoiding something
Fear | widely open eyes; pupils dilated; raised eyebrows; open mouth with crooked lips; trembling chin |
Guilt | downcast or glancing gaze; inner corners of eyebrows may be drawn down; lips drawn in, corners depressed; head lowered |
Interest | eyes may be exaggeratedly opened and fixed; lower eyelids may be raised as though to sharpen visual focus; increased pupil size; sparkling gaze; mouth slightly smiling; head is slightly inclined to the side |
Joy | “smiling” and bright eyes; genuinely smiling mouth |
Sadness | eyelids contracted; partially closed eyes; downturning mouth |
Shame | downcast gaze; blushing cheeks; head is lowered |
Surprise | widely open eyes; slightly raised upper eyelids and eyebrows; the mouth is opened by the jaw drop; the lips are relaxed |
We made EmoHeart available to Second Life users in December 2008. During a two-month period (December 2008 – January 2009), we asked students to promote the EmoHeart object by visiting locations in Second Life and engaging other Second Life residents in social communication. As a result, 89 Second Life users became owners of EmoHeart, and 74 of them actually communicated using it. Text messages, along with the results from the Affect Analysis Model, were stored in an EmoHeart log database. Some general statistics are given in Table
Statistics on EmoHeart log of 74 users for period December 2008 – January 2009.
Measure | Messages, number | Message length, symbols | Sentences, number |
---|---|---|---|
Total | 19591 (for all users) | 400420 (for all messages) | 21396 (for all messages) |
Minimal | 1 (for user) | 1 (for message) | 1 (for message) |
Maximal | 2932 (for user) | 634 (for message) | 25 (for message) |
Average | 265 (per user) | 20 (per message) | 1.09 (per message) |
Of all sentences, 20% were categorized as emotional by the Affect Analysis Model and 80% as neutral (Figure
Percentage distribution of emotional (positive or negative) and neutral sentences.
We analysed the distribution of emotional sentences from EmoHeart log data according to the fine-grained emotion labels from our Affect Analysis Model (Figure
Percentage distribution of sentences with fine-grained emotion annotations.
As the Affect Analysis Model also enables detection of five communicative functions (besides nine distinct affective states) that are frequently observed in online conversations, we analysed the communicative functions identified in the EmoHeart log data as well. The percentage distribution of detected communicative functions is shown in Figure
Percentage distribution of five communicative functions.
This paper introduced the integration of the developed emotion recognition module, the Affect Analysis Model, into the 3D virtual world Second Life. The proposed lexical rule-based algorithm for affect sensing from text enables the analysis of nine emotions at various grammatical levels. For textual input processing, our Affect Analysis Model handles not only correctly written text but also informal messages written in an abbreviated or expressive manner. The salient features of the Affect Analysis Model are the following:
analysis of nine emotions at the level of individual sentences: this is an extensive set of labels compared to the six emotions mainly used in related work,
the ability to handle the evolving language of online communications: to the best of our knowledge, our approach is the first attempt to deal with the informal and abbreviated style of writing, often accompanied by the use of emoticons,
a foundation in a database of affective words (each term in our Affect database was assigned at least one emotion label along with an emotion intensity, in contrast to the single emotion label or polarity orientation annotated in competing approaches), interjections, emoticons, abbreviations and acronyms, and modifiers (which influence the degree of emotional states),
vector representation of affective features of words, phrases, clauses, and sentences,
consideration of syntactic relations and semantic dependencies between words in a sentence: our rule-based method accurately classifies context-dependent affect expressed in sentences containing emotion-conveying terms, which may play different syntactic and semantic roles,
analysis of negation, modality, and conditionality: most researchers ignore modal expressions and conditional constructions; therefore, their systems show poor performance in classifying neutral sentences, which is, indeed, not an easy task,
consideration of relations between clauses in compound, complex, or complex-compound sentences: to our knowledge, AAM is the first system to comprehensively process affect reflected in sentences of different complexity,
emotion intensity estimation: in our work, the strength of emotion is encoded as a numerical value in the interval [0.0, 1.0], in contrast to the low/middle/high levels detected by some competing methods.
Our system showed promising results in fine-grained emotion recognition in real examples of online conversation (diary-like blog posts):
In Second Life, the Affect Analysis Model serves as the engine behind the automatic visualization of emotions conveyed through textual messages. The control of the conversation in Second Life is implemented through the EmoHeart object attached to the avatar’s chest. This object communicates with the Affect Analysis Model located on the server and visually reflects the sensed affective state through the animation of the avatar’s facial expression, the EmoHeart texture, and the size of the texture. In the future, we aim to study cultural differences in perceiving and expressing emotions and to integrate a text-to-speech engine with emotional intonations into the textual chat of Second Life.
The authors would like to thank Alessandro Valitutti and Dr. Diego Reforgiato for their kind help during the creation of the Affect database. They also wish to express their gratitude to Dr. Dzmitry Tsetserukou, Dr. Shaikh Mostafa Al Masum, Manuel M. Martinez, Zoya Verzhbitskaya, Hutchatai Chanlekha, and Nararat Ruangchaijatupon, who contributed to the annotation of Affect database entries and sentences, for their efforts and time. Special thanks also go to Cui Xiaoke, Tananun Orawiwattanakul, and Farzana Yasmeen for their work on EmoHeart promotion in Second Life. This research was partly supported by a JSPS Encouragement of Young Scientists Grant (FY2005-FY2007), an NII Joint Research Grant with the University of Tokyo (FY2007), and an NII Grand Challenge Grant (FY2008-FY2009).