Complexity in Individual Trajectories toward Online

Society faces a fundamental global problem of understanding which individuals are currently developing strong support for some extremist entity such as ISIS (Islamic State), even if they never end up doing anything in the real world. The importance of online connectivity in developing intent has been confirmed by recent case studies of already convicted terrorists. Here we use ideas from Complexity to identify dynamical patterns in the online trajectories that individuals take toward developing a high level of extremist support, specifically, for ISIS. Strong memory effects emerge among individuals whose transition is fastest and hence may become “out of the blue” threats in the real world. A generalization of diagrammatic expansion theory helps quantify these characteristics, including the impact of changes in geographical location, and can facilitate prediction of future risks. By quantifying the trajectories that individuals follow on their journey toward expressing high levels of pro-ISIS support—irrespective of whether they then carry out a real-world attack or not—our findings can help move safety debates beyond reliance on static watch-list identifiers such as ethnic background or immigration status and/or postfact interviews with already convicted individuals. Given the broad commonality of social media platforms, our results likely apply quite generally; for example, even on Telegram where (like Twitter) there is no built-in group feature as in our study, individuals tend to collectively build and pass through the so-called super-group accounts.


Introduction
Relatively unsophisticated but high-impact attacks such as in Manchester, Stockholm, Paris, London, and New York in 2017, and Berlin, Nice, Brussels, and Orlando in 2016, look set to become a fact of life [1][2][3] given the difficulty in detecting potential perpetrators who may act "out of the blue" anywhere in the world. One extremist source is offering $50,000 to anyone, anywhere, just for attempting such an attack [2]. A fundamental problem faced by security agencies is how to move as far as possible "left of boom" in order to detect individuals who are currently developing intent in the form of strong support for some extremist entity, even if they never end up doing anything in the real world. The importance of online connectivity in developing intent [4][5][6][7][8][9][10][11][12] has been confirmed by recent case studies of already convicted terrorists by Gill and others [6, 7]. Quantifying this online dynamical development can help move beyond static watch-list identifiers such as ethnic background or immigration status.
Here we address this problem by analyzing, through the lens of Complexity, a unique dataset of online activity involving the global population of ∼350 million users of the social media outlet VKontakte (https://www.vk.com). VKontakte became the primary online social media source for ISIS propaganda and recruiting before moderator pressure forced activity toward encrypted alternatives such as Telegram in late 2015 [13]. Our experimental design and data gathering follow [12]. Unlike on Facebook, where such activity is squashed almost immediately, support for ISIS on VKontakte develops around online groups that are akin to everyday Facebook groups supporting a particular enterprise in sport, politics, or education [12]. Hence the term "group" simply means an online social media group. In Facebook, for example, it is easy and common for people to create such a group, and VKontakte simply copied this feature. These online groups keep themselves open-source in order to attract new members; hence we are able to record current membership at every instant. Our hybrid system of application program interfaces (APIs) backed up by intensive manual crosschecking shows that 91,781 of the ∼350 million VKontakte users were members of at least one online pro-ISIS group (Figure 1(a)) at the start of our study (1 Jan 2015) or became members during it (i.e., after 1 Jan 2015). Our method for determining a pro-ISIS group, together with explicit examples of its pro-ISIS content, is given in the Supplemental Information (SI). While our dataset is not provably complete or error-free, we believe that our approach of identifying online groups to unravel and quantify the trajectories of individual supporters [12] is as close as one can come without having to sift through all ∼350 million users one by one and without access to classified or private information.
Given the broad commonality of social media platforms, the trajectory results that we report here likely apply more generally; for example, even on Telegram where (like Twitter) there is no built-in group feature, individuals tend to collectively build and pass through the so-called super-group accounts [11]. In our analysis, we will make use of both clock-time and event-time since human activity can be measured in either of these complementary ways. Specifically, clock-time measures the number of days that have passed until the present instant, irrespective of what has happened during these days in terms of events; by contrast, event-time measures the number of events that have happened until the present instant, irrespective of how many days have passed. In each case, we state explicitly whether the quantity of interest is being measured in clock-time or event-time.
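The distinction between the two time measures can be illustrated with a minimal Python sketch. The function names and the example joining dates are hypothetical and are not taken from the dataset; the sketch only assumes that each joining event carries a calendar date.

```python
from datetime import date


def clock_time(first_day: date, now: date) -> int:
    """Days elapsed since the first observation, regardless of activity."""
    return (now - first_day).days


def event_time(join_dates: list[date], now: date) -> int:
    """Number of group-joining events that happened up to and including `now`."""
    return sum(1 for d in join_dates if d <= now)


# Hypothetical user: three group-joining events early in the study period.
joins = [date(2015, 1, 1), date(2015, 1, 3), date(2015, 1, 14)]
now = date(2015, 1, 31)

days_elapsed = clock_time(joins[0], now)   # clock-time in days
events_so_far = event_time(joins, now)     # event-time in joining events
print(days_elapsed, events_so_far)
```

Two users with the same event-time can have very different clock-times, which is why the two measures are reported separately.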

Moderator Bans.
Both individual accounts and groups become physically banned by VKontakte moderators when they become too extreme in their support of pro-ISIS violence. The moderators are responsible for enforcing the rules of VKontakte, which state that individual accounts and groups will be banned if they issue calls to violent actions. This is shown explicitly in Figure 1(b), where a group that has been banned then has this statement appearing in place of its regular content. Our observation of the content of groups that become banned shows that they previously posted material supporting specific ISIS operations or specific calls to attack certain places or sectors of society, or spread material with these pro-ISIS messages. These count as being too extreme, as evidenced by the group then becoming banned. Since the act of banning produces a definite online announcement (e.g., Figure 1(b)), it provides us with a well-defined measure of when an individual or group reaches a high level of pro-ISIS support. We can therefore unambiguously classify each individual in our dataset, while he/she is still developing, as a "future-ban" individual (b) if he/she in the future reaches a high level of pro-ISIS support (i.e., becomes banned). Similarly, we can unambiguously classify each pro-ISIS group as a future-ban group (B) if it eventually gets banned (and A otherwise). During his/her development, each individual may join any number of B and/or A groups, and at any one moment may be fluctuating in terms of tending toward or away from this high level of extremist expression and hence toward or away from becoming banned. We note that the banning of a group does not mean that all its members will necessarily become banned in the future. It can happen that only one or two of the group members were responsible for the content that got the group banned; hence the moderators do not have a reason to ban all the group members.
Though our focus here is on the online development of "future-ban" individuals irrespective of whether they later carry out an extremist act or not, our analysis of media reports together with other individuals' mentions of usernames suggests that a significant number do. For example, the user list includes an eventual combatant who produced real-time audio recordings with street-level detail during assaults in Syria; an eventual suicide bomber who seems to have driven a truck of explosives into a Shia army in Iraq; an individual whose eventual combat activity in Iraq made headlines [14]; and an individual who transitioned to become the leader of Chechen fighters within ISIS.

Trajectories.
Of these 91,781 individuals, 7,707 later develop such extreme online support that their individual account becomes banned by VKontakte; that is, there are 7,707 future-ban individuals. Hence the probability that a given developing individual will eventually reach a high level of extremist support (i.e., become banned) can be estimated from the frequency of occurrences in the data as Π = 7707/91781 = 0.084. We note that the 7,707 future-ban individuals do not all have their first joining event at the same time; instead, their first joining dates are spread over the whole period. Each developing individual's trajectory can be represented as a binary chain of group joining (i.e., a sequence of B and A joining events, as in Figure 1(a)). The banning of groups occurs far more frequently than the banning of individuals [12]. It often happens that the collective content of a group quickly becomes very extreme, while the postings of any particular member may not. Hence the group gets banned at a given instant while the individual(s) does not. As a consequence, individuals who join lots of future-ban (B) groups are not necessarily the ones having their accounts banned. For example, the user in Figure 1(a) joins 34 B groups and only 2 A groups but never becomes banned because his/her individual postings, while actively supporting ISIS (see SI), are not judged to be sufficiently extreme. Indeed, among all the individuals who join 10 or more future-ban (B) groups, only 1,413 are future-ban individuals while 3,619 are not, meaning that the trajectory of future-ban individuals is not simply driven by the process of joining as many future-ban (B) groups as possible. We stress that we do not attempt to classify or identify certain future-ban groups as being the first future-ban groups of future-ban individuals. Instead we focus on identifying the future-ban individuals by treating the classification of the groups as given information and studying the transitions and trajectories of these users among the classified groups.
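The two frequency estimates quoted in this section can be reproduced directly from the reported counts. The following Python sketch uses only numbers stated in the text; the variable names are illustrative.

```python
# Counts reported in the text.
TOTAL_USERS = 91_781        # members of at least one pro-ISIS group
FUTURE_BAN_USERS = 7_707    # users whose accounts were later banned

# Overall risk: fraction of developing individuals who eventually get banned.
risk = FUTURE_BAN_USERS / TOTAL_USERS

# Users who joined 10 or more future-ban (B) groups.
heavy_banned = 1_413
heavy_not_banned = 3_619
risk_heavy = heavy_banned / (heavy_banned + heavy_not_banned)

print(f"overall risk = {risk:.3f}")
print(f"risk among heavy B-joiners = {risk_heavy:.3f}")
```

Even among users who join 10 or more future-ban groups, roughly 28% become banned, so most heavy joiners of future-ban groups are not themselves future-ban individuals.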

Diagrammatic Expansion.
Motivated by the physics approach to understanding transition probabilities in terms of successively higher-order interactions using diagrammatic expansion theory [15], we unravel the contributions to the probability Π that a given individual will develop a high level of extremist support (i.e., becomes banned and hence is a future-ban individual) by expanding Π exactly in terms of successively higher-order interactions with B groups (Figure 1(c)). The data (Figure 1(d)) show that the expansion terms $P(n)$ for becoming banned after joining $n \geq 2$ future-ban groups are interrelated by an approximate power-law distribution $n^{-\alpha}$ with $\alpha \sim 2$, as opposed to a memoryless exponential decay. Though reminiscent of power-law distributions reported for everyday human activities, this is the first example related to extremist pathways. It is consistent with the idea that someone who has joined $n$ groups has accumulated $n$ potentially distinct narratives over time and hence needs to make $\sim n^2$ comparisons between pairs. This $\sim n^2$ increase suggests a similar increase in both required resources and potential narrative discord as $n$ increases, which in turn suggests that the probability $P(n)$ may vary as $\sim n^{-2}$, as observed for $n \geq 2$.
Figure 1: Individuals' online trajectories. (a) Schematic of group-joining in the complex pro-ISIS online space https://www.vk.com, which comprises a mix of individuals (colored squares), groups (clouds), and content (examples show some less explicit postings). One observed sequence of joining events is listed as an example using abbreviations B and A for the joining of a "future-ban" or "no-future-ban" group, respectively. (b) Snapshot from VKontakte showing the measurable event of a group or individual getting banned. (c) Exact diagrammatic expansion for the probability (i.e., risk) Π, which is the probability that any given individual in our study will eventually develop a sufficiently high level of pro-ISIS support that he/she violates VKontakte's rules against promoting pro-ISIS violence and hence his/her account gets banned. $P(n)$ is the probability that his/her account will eventually get banned and he/she joins exactly $n$ future-ban (B) groups prior to this banning. Hence by definition, $P(n) = 0$ for individuals whose accounts do not get banned. The probability propagator for a future-ban (B) group is shown as a gray strip. The expansion of the terms is shown both in terms of diagrams (top) and the corresponding mathematical expressions (below). (d) Log-log plot of the probability $P(n)$, calculated from the empirical data by counting the fraction of individuals whose account eventually gets banned and who join exactly $n$ future-ban (B) groups between Jan 1, 2015, and the instant his/her account gets banned (by definition, $P(n) = 0$ for no-future-ban individuals since they do not get banned). A line with slope −2 is shown as a guide. (e) Same as (d) but over a small range of $n$.
By contrast, Figure 1(e) shows that $P(0)$, and to a lesser extent $P(1)$, is atypical. In particular, the empirical value of $P(0)$ (i.e., the probability that an individual who gets banned will not join any future-ban groups between Jan 1, 2015, and the banning of their account) is far smaller than that predicted by this simple power-law scaling. This has the important consequence that a risk estimate (i.e., probability) that a given individual will in the future reach a high level of pro-ISIS support (i.e., account becomes banned) based purely on observing "lone wolf" individuals who subsequently join no B groups (i.e., only using $P(0)$) will be a significant underestimate, since the $\sim n^{-2}$ scaling makes higher-order corrections to $P(0)$ large in Figure 1(c). The fact that $P(1)$ is by far the largest of all the expansion terms suggests that the impetus that a future-ban individual experiences toward becoming banned after joining his/her first future-ban group is more influential than upon joining his/her second and so on.
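A simple way to check an exponent of about 2 is to fit a straight line to the expansion terms on log-log axes for n ≥ 2. The Python sketch below does this on synthetic values that follow an exact inverse-square law; the data, the prefactor 0.01, and the function name are illustrative assumptions, not the empirical values behind Figure 1(d). A least-squares fit on log-log axes is only a rough diagnostic, and a maximum likelihood fit is generally preferable for real data.

```python
import math


def loglog_slope(ns, ps):
    """Least-squares slope of log(p) against log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(p) for p in ps]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic expansion terms following P(n) = C * n**-2 for n >= 2
# (C = 0.01 is an arbitrary illustrative prefactor).
ns = list(range(2, 51))
ps = [0.01 * n ** -2.0 for n in ns]

alpha = -loglog_slope(ns, ps)  # estimated power-law exponent
print(f"estimated exponent alpha = {alpha:.3f}")
```

The terms for n = 0 and n = 1 are excluded from the fit because the text identifies them as atypical.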
To explore these higher-order contributions to Π more rigorously, we condition the probabilities on a particular value of the event-time lifetime $N_{\mathrm{ban}}$, which is the number of pro-ISIS online groups of type B and A that a future-ban individual joins before becoming banned. Figure 2(a) shows $P(n \mid N_{\mathrm{ban}})$ for representative $N_{\mathrm{ban}}$ values. (By definition, $P(n \mid N_{\mathrm{ban}})$ is undefined for individuals whose accounts do not get banned.) The resulting empirical distributions deviate from the null model of a memoryless binomial process and provide evidence of a specific memory effect in the group-joining activity of those individuals who will eventually become banned, in particular for small $N_{\mathrm{ban}}$. This expansion unraveling of contributions to Π (whether conditioned on a given $N_{\mathrm{ban}}$ or not) allows immediate prediction of all higher-order probability terms for an arbitrarily chosen individual and hence the risk that this individual will reach a high level of pro-ISIS support, based simply on an approximate knowledge of the functional dependence of $P(n)$. For example, evaluating $\Delta = \Pi - P(0)$ quantifies the impact online social media has on the probability (i.e., risk) that an individual will become extreme enough to have their account banned: using the numbers from our dataset, joining online groups enhances the probability that an individual eventually reaches a high enough level of pro-ISIS support that his/her account gets banned, from $P(0) = 1206/91781 = 0.013$ to Π = 0.084, which is an increase of 546%. Evaluating the expansion for Π analytically using additional knowledge of $P(1)$ and applying the scaling factor from $P(1)$ to $P(2)$ for all $n$ reduces this error from 546% to 68%.
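The 546% figure can be checked with a few lines of arithmetic. The Python sketch below uses only the counts quoted in the text and writes P0 for the probability of being banned without joining any future-ban group; the helper name is illustrative. It also shows that the quoted percentage corresponds to the probabilities rounded to three decimals, whereas the unrounded counts give a slightly smaller value.

```python
TOTAL_USERS = 91_781
P0 = 1_206 / TOTAL_USERS   # banned without joining any future-ban group
PI = 7_707 / TOTAL_USERS   # banned at all


def relative_increase(base: float, full: float) -> float:
    """Fractional increase from `base` to `full`."""
    return (full - base) / base


# Using the rounded values quoted in the text (0.013 and 0.084).
rounded = relative_increase(round(P0, 3), round(PI, 3))

# Using the unrounded counts.
exact = relative_increase(P0, PI)

print(f"rounded inputs: {rounded:.0%}, unrounded inputs: {exact:.0%}")
```

Either way, an estimate based only on the zeroth-order term misses most of the total risk.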
In the finite-memory model of Figure 2(a), the theoretical $P(n \mid N_{\mathrm{ban}})$ is primarily determined by $q$, the probability that an individual's next group is of the same type as one joined in one of the past $m$ joining events. We therefore fix $m$ to be small and estimate $q$ and $p$ (the preference for a B group when no past choice is copied) for each value of $N_{\mathrm{ban}}$ from the empirical data using maximum likelihood estimates. The model agrees well with the data (Figure 2(a)) and confirms that the largest impact of memory effects is for the smallest $N_{\mathrm{ban}}$. The fact that the memory effect decreases with an increase in event-time lifetime (Figure 2(a)) is consistent with the notion that individuals that join many groups may have a less certain longer-term goal and hence may act more randomly when choosing their next group type.
Any practical surveillance would also be mindful of clock-time. We now show that the existence of memory effects in event-time lifetime does indeed carry over to clock-time. The data show that event-time lifetimes ($N_{\mathrm{ban}}$) act as a lower bound (see SI) for the corresponding clock-time lifetimes $T_{\mathrm{ban}}$; hence there is a spectrum of individual-dependent conversion factors between event-time and clock-time. This makes sense because individuals who visit a given number of online groups will likely differ considerably in how much clock-time they spend in and between visits; by contrast, event-time just counts the cumulative number of groups joined. Despite this, Figure 2(b) confirms that a strong memory effect indeed arises for individuals with short clock-time lifetimes (now $T_{\mathrm{ban}}$) and that these memory effects can still be modeled using a continuous-time version of our memory-dependent model from Figure 2(a). During any given day, an individual may have no group-joining event; hence the stochastic process mimics a walk in an abstract ideological space as opposed to a group-joining space. We hence consider the simplest case of a memoryless one-dimensional walk in which an individual gradually moves toward, or away from, an absorbing boundary that defines their exit from the system (i.e., account gets banned) and hence determines their clock-time lifetime $T_{\mathrm{ban}}$. As $T_{\mathrm{ban}}$ decreases, the empirical data show an increasing deviation from this memoryless walk result. However, when we add a similar type of memory as before, that is, with probability $q$ the individual takes an action (step) that copies the previous change, we find that the model with memory does now fit the data well (Figure 2(b)).
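The event-time model with memory can be sketched as a short simulation. The Python code below is one possible reading of the model and rests on several assumptions that the text does not fix: the symbol names q (copy probability), p (preference for a B group) and m (memory length) are labels chosen here, the copied event is drawn uniformly from the last m joins, and the very first join has no memory and is drawn using p. It is an illustration, not the fitted model behind Figure 2(a).

```python
import random
from statistics import pvariance


def simulate_b_count(n_ban: int, q: float, p: float, m: int,
                     rng: random.Random) -> int:
    """Simulate n_ban joining events and return the number of B groups joined.

    With probability q the next group copies the type of one of the last m
    joins (chosen uniformly, an assumption); otherwise it is a B group with
    probability p and an A group with probability 1 - p.
    """
    history: list[str] = []
    for _ in range(n_ban):
        if history and rng.random() < q:
            choice = rng.choice(history[-m:])
        else:
            choice = "B" if rng.random() < p else "A"
        history.append(choice)
    return history.count("B")


rng = random.Random(0)  # fixed seed for reproducibility
trials = 20_000

# Illustrative parameter values, not estimates from the data.
with_memory = [simulate_b_count(10, 0.6, 0.5, 1, rng) for _ in range(trials)]
memoryless = [simulate_b_count(10, 0.0, 0.5, 1, rng) for _ in range(trials)]

var_memory = pvariance(with_memory)
var_null = pvariance(memoryless)   # close to the binomial value 10 * 0.5 * 0.5
print(f"variance with memory = {var_memory:.2f}, memoryless = {var_null:.2f}")
```

Setting q = 0 recovers the memoryless binomial null model, while q > 0 makes runs of the same group type more likely and broadens the distribution of the number of B groups joined, which is the kind of deviation from the binomial that the text attributes to memory.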

Modeling Memory in Lifetimes.
An immediate practical implication of our findings so far is that strong, observable memory effects occur for the shortest lifetime individuals, whether measured in event-time or clock-time. This is fortunate since individuals who are moving quickly toward becoming banned (and hence toward showing a high level of pro-ISIS support) are also likely to be of most interest as potential threats. Though they may end up never carrying out a real-world extremist act, their banned status means that they likely harbor the strongest intent.

Preferred Dynamical Transitions.
In accordance with the existence of memory effects in the trajectory of group joining, we find that future-ban individuals' choice of their next group to join depends on the last group that they joined and that there are rather well-defined attractors. The classes of groups are designated according to their size compared to the median group size ("small" or "large") and whether they have a majority of postings that are focused on a news story ("news") or on more abstract discussions (we call these "spiritual," since most of these are indeed spiritual in their content). Though other classifiers are in principle possible, and this choice of classifiers is somewhat subjective, we have checked that only using a subset does not change our main conclusions and instead muddles the behavioral patterns. Figure 3(a) (left) uses the relative frequencies of online group-joining events to generate estimates of the conditional probability that a future-ban individual who most recently chose a group of class $i$ (row) will next join a group of class $j$. The SI provides details of the calculation and additional results, including for subsets of attributes. Figure 3(a) (right) shows the corresponding result if these individuals were to choose their next group based purely on the number and size of groups that exist at the time of their choice, that is, akin to preferential attachment as in the coalescence-fragmentation model in [12].
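Estimating conditional transition probabilities from relative frequencies amounts to counting consecutive pairs of group classes and normalizing each row. The Python sketch below shows this under stated assumptions: the class labels and the two example paths are hypothetical, and the actual calculation in the SI may differ in detail.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate P(next class | previous class) from relative frequencies."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each consecutive (previous, next) pair of joined classes.
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }


# Hypothetical joining paths, each entry being the class of the group joined.
paths = [
    ["B-large-news", "B-large-news", "A-small-news"],
    ["A-small-news", "B-large-news", "B-large-news"],
]
probs = transition_probabilities(paths)
print(probs["B-large-news"])
```

Each row of the resulting table sums to one, so it can be compared row by row with a null model in which the next group is chosen purely by the number and size of available groups.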
Figure 3(a) shows that future-ban individuals have the greatest preference for next joining a banned, large, news group, with this type of group acting as a global attractor. Although this in part reflects the higher relative number and size of this group class, the differences between the left and right panels of Figure 3(a) (together with the very high $\chi^2$ and negligible $p$ values shown) show that the full story behind these individuals' choice-making lies beyond a pure size effect. To explore this further, we renormalize the transition probabilities by the number of groups and members in the future class $j$, yielding the results in Figure 3(b). Figure 3(b) (left) shows that future-ban individuals have an additional attraction, beyond preferential attachment, toward either future-ban, small, spiritual groups or no-future-ban, small, news groups. A theoretical model that incorporates "character" into preferential attachment, as presented in [16], could help tease out these differences, backed up by more detailed knowledge of individuals' circumstances.
... each individual for which location was reported. With the caveat in mind that this is a preliminary approach, we assign a change in an individual's declared geographical location as a "scattering" event in terms of changing the online trajectory of that individual (Figure 4(c)), for example, because of a change in his/her attitude or circumstances. Taking our current data at face value, the VKontakte data show (Figures 4(a) and 4(b)) that there are visible patterns in how individuals of the two different types change their declared locations, with future-ban individuals visibly avoiding loops (i.e., fewer return trips). We can quantify this in a simple way by looking at the fraction of return trips. If $n_{XY}$ future-ban individuals change country from $X$ to $Y$ during our period of study and $n_{YX}$ individuals change country from $Y$ to $X$, we can define a country return rate for this country pair as $\min(n_{XY}, n_{YX})/\max(n_{XY}, n_{YX})$. We then average this quantity over all possible pairs of countries to obtain a country return rate $R$. For future-ban individuals (Figure 4(a)), this country return rate is $R = 21.6\%$, while for no-future-ban individuals (Figure 4(b)) it is $R = 44.7\%$, which is more than twice as large. Figure 4(c) shows that for future-ban individuals that change country, their higher-order ($n \geq 2$) probabilities tend to increase in the diagrammatic expansion; that is, estimates of the probability of interest Π based solely on $P(0)$ will be worse. This suggests an opportunity to extend existing research on real-space conflict and mobility [16][17][18][19][20][21][22] to develop a fuller theory of extremist dynamics in coupled cyber-physical space.
Among future-ban individuals, the average number of online pro-ISIS groups joined immediately after a move from Syria to any other country is 0.36, as compared to an average over time of 0.18. This is a statistically significant increase ($p = 0.03$). Similarly, the average number of online groups that such future-ban individuals join immediately before a move to Germany from any other country is 0.48, as compared to an average over time of 0.36 ($p = 0.08$). The case of the US is similar to Germany ($p = 0.1$). This means that, among future-ban individuals who come from anywhere in the world, the ones that enter Germany and the US tend to do so immediately after a burst of online group-joining activity, a fact that could be used to identify future high-risk individuals in those countries. Among all those who are at some stage in Syria, there is a higher percentage of future-ban individuals who go from Syria to France than go to Syria from France. The same is true for Turkey, Iraq, and Australia. The vast majority of individuals moving from Syria to the US or UK are no-future-ban, suggesting that the threat in the US and UK is a long-term latent one, perhaps like the 2017 Manchester bomber. With more reliable country-specific data, future risk probabilities tailored to these specific locations and movements could be calculated by conditioning the scattering transitions in Figure 4(c).
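The country return rate can be computed from a list of declared location changes. The Python sketch below follows the min/max definition given in the text, with two assumptions made here: the average is taken only over country pairs with at least one observed move (pairs with no moves in either direction would give 0/0), and the country codes and counts in the example are hypothetical.

```python
from collections import Counter


def country_return_rate(moves):
    """Average of min(n_xy, n_yx) / max(n_xy, n_yx) over observed country pairs.

    `moves` is an iterable of (origin, destination) tuples, one per relocation.
    """
    flows = Counter(moves)
    # Unordered pairs of distinct countries with at least one observed move.
    pairs = {frozenset(move) for move in flows if move[0] != move[1]}
    if not pairs:
        return 0.0
    rates = []
    for pair in pairs:
        x, y = sorted(pair)
        n_xy, n_yx = flows[(x, y)], flows[(y, x)]
        rates.append(min(n_xy, n_yx) / max(n_xy, n_yx))
    return sum(rates) / len(rates)


# Hypothetical moves: X<->Y is strongly one-directional, Z<->W is balanced.
example_moves = (
    [("X", "Y")] * 4 + [("Y", "X")] * 1 + [("Z", "W")] * 2 + [("W", "Z")] * 2
)
rate = country_return_rate(example_moves)
print(f"country return rate = {rate:.3f}")
```

A rate near 1 means that flows between country pairs are balanced (many return trips), while a rate near 0 means that movement is mostly one-way.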

Conclusions
Our paper addresses the pressing societal problem of understanding which individuals are currently developing intent in the form of strong support for some extremist entity, even if they never end up doing anything in the real world. Using a unique dataset from an online social media source, we have identified specific dynamical patterns in the online trajectories that individuals take toward developing a high level of extremist support, specifically, for ISIS. We identified strong memory effects emerging among individuals whose transition is fastest and hence may become "out of the blue" threats in the real world. A generalization of diagrammatic expansion theory helped quantify these characteristics, including the impact of changes in geographical location, and can now be used to facilitate prediction of future risks. Our findings help move beyond postfact interviews with already convicted terrorists, by focusing on the trajectories that individuals follow with respect to developing and expressing high levels of pro-ISIS support, irrespective of whether they then carry out a real-world attack or not. Our findings also help move global security debates beyond static watch-list identifiers such as ethnic background or immigration status. Given the broad commonality of social media platforms, the results that we report here likely apply more generally; for example, even on Telegram where (like Twitter) there is no built-in group feature, individuals tend to collectively build and pass through so-called super-group accounts [11].
Figure 1 (panel text). (a) An individual joins the n'th group at time t; a future-ban individual joins the (n + 1)'th group at time t′ > t. Example: user 1549171532 (male): B→B→B→B→B→B→B→B→B→B→B→B→B→B→B→B→B→B→B→A→B→B→B→B→A→B→B→B→B→B→B→B→B→B→B→B. (b) Notice shown on a banned page: "has been suspended due to calls to violent actions".

Figure 2: Memory effects for shorter lifetimes. (a) Plot of conditional expansion terms $P(n \mid N_{\mathrm{ban}})$ given a specific event-time lifetime $N_{\mathrm{ban}}$ for individuals who will eventually get banned (i.e., future-ban individuals). $P(n \mid N_{\mathrm{ban}}) = P(n \cap N_{\mathrm{ban}})/P(N_{\mathrm{ban}})$ is the probability that any given individual in our study (i.e., out of all individuals) will get banned and that he/she joins exactly $n$ future-ban (B) groups prior to this banning, conditioned on his/her event-time lifetime until banning being $N_{\mathrm{ban}}$. By definition, $N_{\mathrm{ban}}$ and hence $P(n \mid N_{\mathrm{ban}})$ are undefined for individuals whose accounts do not get banned. Scattered points: empirical values obtained from the set of individual pathways in our dataset. Solid lines: our finite memory model with parameter value $q$ shown and obtained from a maximum likelihood estimate (MLE). Dotted lines: result for a memoryless null model in which individuals have the same average group-joining rate as the data (obtained using MLE), hence yielding a binomial distribution. The largest memory effects arise for the shortest event-time lifetimes. (b) Comparison of the clock-time lifetime $T_{\mathrm{ban}}$ distribution from the empirical data (red dots), our mathematical model with memory effects (blue), and without memory effects (green). Error bands are obtained from simulations of the model. $T_{\mathrm{ban}}$ is undefined for individuals whose accounts do not get banned. The largest memory effect arises for the shortest clock-time lifetimes.
The simulated model for individuals who are eventually banned (Figure 3(b) right) still shows small fluctuations because of the finite time window, but the empirical result (Figure 3(b) left) shows marked differences that are statistically significant ($\chi^2 = 474.3$, $p < 0.00005$).

Figure 4: Influence of geography. (a) Flow of future-ban individuals based on reported geographical locations in online profiles. The flow from country X to country Y is the number of individuals that change location from X to Y and is proportional to the thickness of the line. The line color corresponds to the originating country X. Each country's node is at its geographic centre. Two individuals' trajectories are listed as examples. Although Russia resembles a travel hub for all individual types because VKontakte has a large number of Russian users, many individuals' pathways do not pass through it (e.g., user 1658609132). (b) Same as (a) but for individuals whose accounts are not eventually banned ("no-future-ban" individuals). For future-ban individuals (a), the country return rate is $R = 21.6\%$, while for no-future-ban individuals (b) it is $R = 44.7\%$, which is more than twice as large. (c) Scattering between countries is associated with a shift in the dependence of $P(n)$. Red line is for future-ban individuals who changed country at least once. Black line is for future-ban individuals who never changed country.
Figure 2(a) shows the results of a mathematical model that we introduce that captures these finite memory signatures in the empirical trajectories and helps elucidate their nature. Each step in the model features a stochastic process in event-time in which the individual is assumed to join a group (with probability $q$) that is of the same type (i.e., B or A) as the one that they joined in one of the past $m$ joining events. With probability $(1 - q)p$ they join a B group, and with probability $(1 - q)(1 - p)$ they join an A group, where $p$ determines the individual preference of group types. Simulations show that the results are similar for all $m$ as long as $q < 0.7$ and that the theoretical $P(n \mid N_{\mathrm{ban}})$ is primarily determined by $q$.