Research on the Human Dynamics in Mobile Communities Based on Social Identity

Through analyzing data on the release, commenting, and forwarding of 120,000 microblog messages over one year, this paper finds that the intervals between information releases and between comments follow a power law. In addition, analysis of the data over each 24-hour period reveals clear differences between microblogging and website visits, email, instant communication, and mobile phone use, reflecting how people use fragments of time via mobile Internet technology. The paper points out the significant influence of user activity on the intervals between information releases and demonstrates a positive correlation between activity and the power exponent. It also shows that user activity is positively influenced by social identity. The simulation results based on the social identity mechanism fit the actual data well, which indicates that this mechanism is a reasonable way to explain people's behavior in the mobile Internet.


Introduction
In traditional studies, it is usually assumed that people's behaviors are random in time and can thus be described simply as Poisson processes. However, as data collection methods and data processing capabilities have improved significantly, more and more empirical studies of people's behaviors show that many of them deviate from the Poisson distribution. Studies in this field fall mainly into two categories: empirical studies of the statistical features of people's behaviors in different contexts, especially on the Internet, and studies of theoretical models that interpret those behaviors. This paper focuses on the features of people's behaviors in the mobile Internet and on the theoretical model concerned.
As for the empirical study of the statistical features of people's behaviors, Barabási first researched the interval distribution of behaviors like sending emails, waiting for reply [...] their memories [7]. Early replies have no impact on later visitors, and people just reply as they like.
People act out of complex motivations, and not all activities are tasks (for example, browsing and ordering movies), yet such behaviors also follow certain statistical laws. According to Shang and other scholars, people's interest in new things wanes or even disappears with frequent involvement, but may suddenly revive after lasting indifference. This change of interest may cause the heavy-tailed distribution of their behaviors. Shang and others proposed to quantify people's interest by the probability of occurrence during intervals [22]. Han and others also noticed that people's interest in a certain activity may change with their feelings, and thus proposed the self-adapting human dynamic mechanism [23]. Guo held that blogging becomes less attractive as people become more and more involved [24]. Wu and others established a human dynamic model, holding that individuals' behaviors in an online commenting system are influenced by other individuals [25].
Existing studies have shown that the Internet provides a sound data foundation for human dynamics research, and its wide use in people's daily life also determines the significance of Internet-based human dynamics research. However, the above-mentioned research results are challenged by the rapid development of the mobile Internet in recent years.
In the context of the mobile Internet, people's behaviors in fragments of time are hardly "task oriented", as they act according to what they see and feel. This shows that the priority-based task queue model is limited in explaining such behaviors. A dynamic model with memory, as well as an interest-driven dynamic model, is also limited because of the fragmentation of time and content for behaviors in the mobile Internet. Moreover, most users of the mobile Internet are young people who, on the one hand, try to express their unique personality and, on the other hand, desire a social identity, which is manifested in behaviors like online commenting, collecting, ordering, and forwarding. Therefore, social identity as a driving force might supplement interest-driven human dynamics research.
This paper analyzes microblog data from http://www.sina.com/, which is among the largest Chinese microblog websites in the world. Empirical analysis shows that the distribution of intervals between information releases on the microblog follows a power law and verifies a positive correlation between user activity and the power exponent. Based on a rescaling method, our research presents a universal behavior for users with different activities, which is consistent with the results in [10, 13, 15]. The paper assumes that the activity of an individual user is influenced by social identity and quantifies social identity to improve the memory model [5]. The simulation results are consistent with the empirical data.
The remainder of the paper is organized as follows. Section 2 presents the empirical analysis results. Section 3 analyzes the influence of user activity on the statistical features of people's behaviors, proposes a social identity mechanism, and compares the simulation-based results with empirical data. Finally, the conclusions of the whole paper are presented in Section 4.
[...] aggregated many users at current stage, we collected all messages about an entertainment topic, ranging from August 20, 2009 to September 3, 2010. During this period, a total of 125,152 messages were released by 175 users; these messages were forwarded 2,260,826 times and triggered 1,786,000 comments.

Statistic Features of Empirical Data
To analyze the releasing behavior, the releasing intervals of each individual are calculated separately. Then the intervals of all individuals are aggregated to obtain the distribution. For example, suppose Table 1 is the releasing information of all individuals. Each line of the table presents the releasing time of every message from the corresponding user. For user u1, the releasing intervals are t3-t1, t4-t3, and so on. For user u2, the releasing intervals are t5-t2, t6-t5, and so on. Intervals of comments are calculated similarly.
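The per-user interval calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `release_intervals`, the `(user_id, timestamp)` input format, and the numeric timestamps are all assumptions made for the example.

```python
from collections import defaultdict


def release_intervals(events):
    """Return the intervals between consecutive messages, pooled over users.

    `events` is an iterable of (user_id, timestamp) pairs in any order.
    The input format is hypothetical; timestamps are plain numbers here.
    """
    times_by_user = defaultdict(list)
    for user, t in events:
        times_by_user[user].append(t)

    intervals = []
    for times in times_by_user.values():
        times.sort()
        # Differences are taken within one user only, never across users.
        intervals.extend(later - earlier
                         for earlier, later in zip(times, times[1:]))
    return intervals


# Interleaved stream in the spirit of the Table 1 example:
# u1 posts at t1, t3, t4 and u2 posts at t2, t5, t6.
events = [("u1", 1), ("u2", 2), ("u1", 3), ("u1", 4), ("u2", 5), ("u2", 6)]
pooled = sorted(release_intervals(events))
print(pooled)
```

Grouping by user before differencing matters: subtracting adjacent timestamps of the raw interleaved stream would mix users and produce intervals that belong to nobody.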
According to the data collected, the interval distributions of both releasing and commenting follow a power law, as shown on the log-log axes of Figure 1.
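The text does not say how the exponents in Figure 1 were fitted. As one possible approach, and not necessarily the one used here, the sketch below applies the standard continuous maximum-likelihood estimator for a power law, α = 1 + n / Σ ln(x_i / x_min), and checks it on synthetic intervals. The function names, the cutoff `x_min`, and the synthetic data are assumptions for illustration only.

```python
import math
import random


def powerlaw_exponent_mle(samples, x_min):
    """Continuous maximum-likelihood estimate of alpha in p(x) ~ x**(-alpha).

    Only samples with x >= x_min enter the fit.
    """
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0:
        raise ValueError("need at least one sample strictly above x_min")
    return 1.0 + len(tail) / log_sum


def sample_powerlaw(alpha, x_min, n, rng):
    """Draw n synthetic intervals by inverse-transform sampling."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


rng = random.Random(0)
# Synthetic data only; 1.34 is the exponent value quoted in the text.
synthetic = sample_powerlaw(alpha=1.34, x_min=1.0, n=50_000, rng=rng)
alpha_hat = powerlaw_exponent_mle(synthetic, x_min=1.0)
print(round(alpha_hat, 2))
```

A least-squares line through a log-binned histogram is another common choice; the estimator above is used here only because it needs no binning decisions.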
According to Figure 1, the power-law exponents of message releasing and commenting intervals are almost the same, which is consistent with the results of [16, 17]. An individual's message releasing behavior is driven to some extent by the commenting behavior of others, so the two distributions decay with the same exponent. However, the exponent of 1.34 in this paper is higher than that found in the research on email communication [16] and lower than the exponent reported in [17]. In email communication, spam and irrelevant emails often mix with normal emails, which may cause the long tail of the interval distribution. By contrast, on the one hand, microblog messages on http://www.sina.com/ are organized by topics, and communication in microblog communities is multipoint to multipoint, which increases the frequency of communication. On the other hand, microblogging is based on mobile Internet technologies that enable individuals to release messages anytime and anywhere. The differences in the mode of communication and in technology may cause the diversity of power-law exponents across scenarios.
Figure 2 shows the correlation between the number of comments and the number of forwards, where x is the number of forwards and y is the number of comments. When x is lower than 2,500, y ≈ 2.2x. When y is lower than 2,500, y ≈ 0.25x. There are almost no data points in the area where both x and y are above 2,500. In over 95% of all cases, both x and y are lower than 500, and the two are not obviously correlated. On this basis, messages on microblogs can be divided into two groups: those forwarded more frequently but commented on less, which are intended to inform more individuals rather than attract comments, and those commented on more but forwarded less, which may arouse extensive discussion rather than spread widely.
As indicated in Figure 3, which analyzes microblog data over 24 hours, information releases peak between 11 am and 12 noon both at weekends and on weekdays. This is quite different from the peak for email, instant communication, and mobile phone use, which is usually at 10 am. The lagging peak of microblogging indicates that most microblog messages have nothing to do with work and are just small talk during breaks.
Figure 4 illustrates the releasing time of all messages from the 175 users in one day. Two peak periods, namely 11:00-13:00 and 23:00-01:00, again reflect the use of microblogs in fragments of time.
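The hour-of-day counts behind this kind of analysis can be sketched as follows. The function name `hourly_counts` and the sample timestamps are hypothetical; the 1 to 24 bin labels follow the note to Figure 3, where "1" covers 0 to 1 o'clock.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(timestamps):
    """Count messages per hour-of-day bin, labelled 1..24.

    Label h covers the hour from (h - 1):00 to h:00.
    """
    counts = Counter(ts.hour + 1 for ts in timestamps)
    return [counts.get(h, 0) for h in range(1, 25)]


# Hypothetical posting times, not taken from the data set.
posts = [
    datetime(2010, 3, 1, 11, 5),
    datetime(2010, 3, 1, 11, 40),
    datetime(2010, 3, 1, 23, 15),
    datetime(2010, 3, 2, 0, 30),
]
counts = hourly_counts(posts)
peak_label = counts.index(max(counts)) + 1
print(peak_label)
```

Weekday and weekend curves can be obtained by filtering the timestamps with `ts.weekday()` before counting and averaging over the number of days in each class.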

Influence of User's Activity on Statistical Features
The intensity of messaging activity has a significant influence on the distribution of messaging intervals: the more active the user is, the larger the power exponent of the interval distribution will be. Suppose the activity of user i is $A_i = n_i/T_i$, with $n_i$ indicating the total number of messages sent by the user and $T_i$ being the total time spent on all behaviors. In this paper, $A_i$ means the number of messages sent by user i per day. To explore the role of user activity in the online social commenting system, users sending over 1,000 messages are picked out. As shown in Figure 5, the average activity of these users is A = 12.04. These users are arranged in descending order of their activity and then divided into 5 groups, each with the same number of members. The average activity of each group is given by (3.1).
Figure 6 illustrates the interval distributions of the five groups. According to the figure, the decay exponents depend on the activity of the users, and therefore the interval distributions corresponding to different levels of activity are more representative than a global one. Lower average activity corresponds to a lower power exponent and thus allows longer intervals, which is consistent with the conclusion in [12]. The correlation between exponents and corresponding average activities is presented in Figure 7.
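The activity measure and the grouping step can be illustrated with a short sketch. The per-user message counts and observation spans below are invented for the example, the function names are hypothetical, and the group average is taken as a plain arithmetic mean, which is an assumption since the defining equation is not reproduced here.

```python
def activity(n_messages, n_days):
    """Activity A_i = n_i / T_i, in messages per day."""
    return n_messages / n_days


def split_by_activity(activities, n_groups=5):
    """Rank users by activity (descending) and cut into equal-sized groups."""
    ranked = sorted(activities.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) % n_groups:
        raise ValueError("user count must be divisible by n_groups")
    size = len(ranked) // n_groups
    return [ranked[i * size:(i + 1) * size] for i in range(n_groups)]


def mean_activity(group):
    """Arithmetic mean of the activities in one group (assumed definition)."""
    return sum(a for _, a in group) / len(group)


# Hypothetical users: (messages sent, days observed).
raw = {"u%d" % i: (1000 + 300 * i, 380) for i in range(10)}
acts = {user: activity(n, days) for user, (n, days) in raw.items()}
groups = split_by_activity(acts)
group_means = [mean_activity(g) for g in groups]
print([round(m, 2) for m in group_means])
```

Because users are ranked before being cut into groups, the group means come out in descending order, which is what allows each group's exponent to be plotted against its average activity.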
The variation of activity reflects the difference in individuals' behavior patterns. Here we use a rescaling method to describe the universal behavior of different groups. Instead of considering the values of the interval t, we take into account the rescaled variable t/Δt, where Δt represents the average interval of the respective group of users. As illustrated in Figure 8, the scaling produces a data collapse of the curves of the five groups. This phenomenon has also been noticed in other systems [10, 13, 15].
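The rescaling step amounts to dividing each interval by the mean interval of its group. The sketch below uses two invented groups whose intervals differ only by a constant time scale, so that the collapse is exact; real data would collapse only approximately. The function name is hypothetical.

```python
def rescale_intervals(intervals):
    """Divide each interval t by the group's mean interval Δt, giving t / Δt."""
    mean_interval = sum(intervals) / len(intervals)
    return [t / mean_interval for t in intervals]


# Two hypothetical groups whose intervals differ only by a factor of ten.
active_group = [1.0, 2.0, 3.0, 6.0]
quiet_group = [10.0, 20.0, 30.0, 60.0]

rescaled_active = rescale_intervals(active_group)
rescaled_quiet = rescale_intervals(quiet_group)
print(rescaled_active)
print(rescaled_quiet)
```

After rescaling, each group has a mean interval of 1, so any remaining difference between the curves reflects the shape of the distribution rather than the overall activity level.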

Influencing Mechanism of Social Identity on Information Releases Behaviors
To analyze the influences on user activity in mobile communities, this paper introduces social identity, which is quantified and used to improve the memory model proposed by Vázquez. Based on intuitive experience of users' messaging habits, if messages from user i are recognized by other users, namely, if such messages are commented on or forwarded, user i would like to send more messages; otherwise he may become less interested. According to the equation proposed in reference [21], the parameter a controls the degree and type of intuitive perception. In this paper, p means the probability of sending another message after the user's previous message is commented on or forwarded. In case of no comment or forwarding, the probability of sending another message is 1 − p. The value of p approximates the ratio between the number of messages commented on or forwarded and the total number of messages sent by the user. From (3.2) we can get a probability density function that follows a power law.
In case of comment or forwarding, the social identity mechanism encourages users to send more messages (a > 1), and the corresponding power exponent is given by formula (3.5).
If no comment or forwarding occurs, the social identity is discouraging (a < 1), and the corresponding power exponent is
$$\alpha = 2 - \frac{a}{1 - a}. \qquad (3.6)$$
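The estimate of p described above, the share of a user's messages that drew at least one comment or forward, can be sketched as follows. The function name `recognition_ratio` and the `(n_comments, n_forwards)` record format are assumptions, and the sample history is invented.

```python
def recognition_ratio(messages):
    """Fraction of messages that received at least one comment or forward.

    `messages` is a list of (n_comments, n_forwards) pairs, one per message.
    This record format is hypothetical.
    """
    if not messages:
        raise ValueError("at least one message is required")
    recognized = sum(1 for comments, forwards in messages
                     if comments > 0 or forwards > 0)
    return recognized / len(messages)


# Invented history: three of five messages drew a reaction.
history = [(3, 0), (0, 0), (0, 5), (1, 2), (0, 0)]
p = recognition_ratio(history)
print(p)
```

A message counts as recognized if it has either kind of reaction, matching the "commented or forwarded" wording; requiring both would give a smaller p.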

Comparison between Simulation Results and Empirical Data
Two typical users are picked out, respectively, from group 2 and group 4 and are marked as user 2 and user 4. Their information is as shown in Table 2.
When the social identity mechanism encourages message sending, with 0.7 < p < 0.9, the range of the corresponding power exponent obtained via formula 3.5 is 2.11 < α < 2.43. When the social identity mechanism is discouraging, the range of the corresponding power exponent calculated according to the same formula is 1.57 < α < 1.89.
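As a rough numerical check of the discouraging range, the sketch below reads the expression labelled 3.6 as α = 2 − a/(1 − a) and assumes a = 1 − p. The mapping from p to a is not stated in the text; it is an assumption adopted here only because it reproduces the quoted bounds. The encouraging branch is not covered, and the function name is hypothetical.

```python
def discouraging_exponent(a):
    """Power exponent alpha = 2 - a / (1 - a), defined for a < 1."""
    if not a < 1:
        raise ValueError("the discouraging branch requires a < 1")
    return 2.0 - a / (1.0 - a)


for p in (0.7, 0.9):
    a = 1.0 - p  # ASSUMPTION: a = 1 - p, not stated in the text
    print(p, round(discouraging_exponent(a), 2))
```

Under that assumption the two endpoints give approximately 1.57 and 1.89, in line with the range 1.57 < α < 1.89 quoted above.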
Figure 9 shows the distribution of intervals between messages sent by user 2 under the social identity mechanism. Figure 9(a) shows the distribution of intervals between messages sent by user 2 in case of comment or forwarding, with p = 0.72 and α = 2.34. Figure 9(b) shows the distribution of intervals between messages in case of no comment or forwarding, with 1 − p = 0.28 and α = 1.70. The values of the power exponents fall within the ranges calculated for the encouraging and discouraging cases of the social identity mechanism.
Figure 10 shows the distribution of intervals between messages sent by user 4 under the social identity mechanism. Figure 10(a) shows the distribution of intervals between messages sent by user 4 in case of comment or forwarding, with p = 0.8 and α = 2.14. The model-based results fit well with the empirical data, indicating the influence of social identity on the activity of individual users. Namely, the higher the social identity is, the more active the user will be, and vice versa.

Conclusions
In recent years, many studies have been conducted on the statistical laws of people's behavior on the Internet and on the theoretical mechanisms concerned, and they have found that many behaviors follow a power-law distribution in their temporal statistics. This paper analyzes microblog data and reveals the law of information releases in mobile communities.
Analysis of over 120,000 messages sent through the microblog at http://www.sina.com/ reveals a power-law interval distribution; in addition, according to the 24-hour behavior analysis, the pattern of microblogging differs from that of email, instant communication, and mobile phone calls. Microblogging based on mobile Internet technology enables users to send messages by mobile phone whenever and wherever they like, rather than only via computer. This paper explores the features of using mobile Internet technology in fragments of time and how this differs from online browsing in terms of time and space. In addition, it points out the significant influence of user activity on the distribution of intervals between information releases and verifies the positive correlation between user activity and the power exponent. More importantly, it finds that social identity has a direct influence on user activity in mobile communities: the higher the social identity is, the more active the user will be, and vice versa. Results from the social identity-based simulation are consistent with the actual data, indicating that the social identity mechanism is a proper way to interpret people's behaviors in the context of the mobile Internet.

Figure 1: Distribution of intervals: (a) distribution of intervals between initial releases; (b) distribution of intervals between comments.

Figure 2: Correlation between comments and forwarding.

Figure 3: Microblog data over a 24 h period. Note: The figure shows the number of messages posted on the microblog in 24 hours. "1" on the X-axis indicates from "0" to "1" o'clock, "2" means from "1" to "2", and so on. The weekday curve represents the average number between Monday and Friday. The Y-axis is the number of messages.

Figure 4: The releasing time of messages sent by all 175 users in one day.

Figure 5: Intervals of messaging for all users sending over 1,000 messages.

Figure 7: Correlation between exponents and corresponding average activities.

Figure 8: The interval distributions of every group.

Figure 9: Distribution of intervals between messages sent by user 2 under the social identity mechanism.

Figure 10: Distribution of intervals between messages sent by user 4 under the social identity mechanism.

Figure 10(b) shows the distribution of intervals between messages in case of no comment or forwarding, with 1 − p = 0.20 and α = 1.60. The values of the power exponents fall within the ranges calculated for the encouraging and discouraging cases of the social identity mechanism.

Table 1: An example of message releasing behaviors.

Table 2: Messaging data of user 2 and user 4.