Arresting Strategy Based on Dynamic Criminal Networks Changing over Time

We investigate a sequence of dynamic criminal networks on a time series based on the dynamic network analysis (DNA). According to the change of networks’ structure, networks’ variation trend is analyzed to forecast its future structure. Finally, an optimal arresting time and priority list are designed based on our analysis. Better results can be expected than that based on social network analysis (SNA).


Introduction
Since September 11 attacks, focus of criminal analysis has been shifted to find terrorist networks, and a number of results have been obtained [1,2].Although they faced many challenges like the incompleteness, incorrectness, and inconsistency of information, they made use of the data mining and did subgroup detection to discover information.Later, they developed some visualization tools [3] to help observe criminal networks' structure.All of these must be owed to the development and maturity of SNA [4][5][6].SNA was born in the 1930s, and Moreno did much foundational work for it.After that, SNA developed fast especially in the 1960s-1970s.Various metrics were defined, including centrality, clustering coefficient, and density [7][8][9][10].However, most work was still limited on static criminal network analysis.On the other hand, a new study shows that DNA is on its way [11,12].DNA is an extension and evolution of SNA.It is a study combined with multiagent or multimode, full of uncertainty.Meanwhile, it introduces the domain of time into it, which makes it much more complex than SNA.In fact, our research on DNA is just staying in the beginning, and many problems are waiting to be done.For example, one should predict the traits of networks precisely and develop more techniques to visualize the outline of the whole sequence of dynamic networks.To sum up, it should be studied further.
Since the research to find the suspects and predict the crime time is still based on analyzing frequency and content of communications between suspects, which can be easily concealed and disguised by real criminals, we start from another direction, by observing a sequence of criminal networks of different time and making use of hierarch clustering to analyze the structure changes of criminal networks.
Suppose that we have got a database of message traffic records of all the criminal suspects.Of course, the time that the message is sent will be marked.Furthermore, we suppose that semantic network analysis has been finished and all the messages about crime and suspects have been extracted from the original database.So we can skip preparation part and directly focus our attention on researching the structure of criminal networks.
As our first contribution, we introduce DNA to construct a sequence of criminal networks.For the second contribution, we modify the definition of centrality and partition method based on hierarchy clustering to make it serve our model better.As our third contribution, we suggest a method, time series analysis.By this way, we can predict the possible crime time of every suspects.Finally, through analyzing the structural changes of networks, we propose four different processes to help determine the best arresting time; meanwhile, we design an arresting priority list based on the degree of importance of the suspects.

Establishing Dynamic Criminal Networks
Denote the time interval by Δ and divide the database of E-mail traffic into different parts according to their sending time.We regard the sending time of the earliest E-mail as the initial time; all the E-mails with sending time from 0 to Δ will be used to construct the first dynamic criminal networks (0), followed by next networks of (Δ), (2Δ), (3Δ), and so on.The network (Δ) is constructed by the data between time  × Δ and ( + 1) × Δ, which is unrelated to the data of other time.Now, we get a sequence of criminal networks.For each network (Δ), the nodes of the network represent the criminals who send messages during the time between  × Δ and ( + 1) × Δ, and the edges represent the exchange between  and .The weight of the edge is the times of message exchanges between  and .These networks are all weighted undirected graphs.
In order to describe the network (Δ), we first build up a relation distance matrix   (Δ) and relation strength matrix   (Δ).
Here are some notations that we will use as follows: : the index of member nodes; : the total number of nodes; : the length of the shortest path between nodes  and .The shortest path is calculated based on the Floyd algorithm.
Then, we get the relation distance matrix   (Δ): and relation strength matrix These two matrixes contain nearly all the information of the messages traffic of time between  × Δ and ( + 1) × Δ, which is our basis of the following analysis and forecast.

Designing Modified Centrality Standard
According to Freeman's research, there are three popular centrality measures-degree centrality (  ()), betweenness centrality (  ()), and closeness centrality (  ()).They are helpful in analyzing the criminal network.These three measures are used to measure the relational strength of a network.What is more, they can be used to identify key members who play important roles in a network.But it omits the frequency of transmitting messages, which is crucial to reflect the relation strength between different nodes, so we modify the above definitions and give a new definition of centrality measures as follows.
Degree centrality is defined as follows: where  is the total number of nodes in a network and   has been defined in the previous section, which denotes a variable indicating the weighted number of messages between nodes  and  (an edge from node  to node  or from node  to node  can either be counted as a link).Betweenness centrality is defined as follows: where   () equals 1 if the shortest path from the node  to the node  passes through the node ; otherwise,   () is 0. Closeness centrality is defined as follows: where   is the length of the shortest path connecting nodes  and .The shortest path is calculated based on the Floyd algorithm.
The centralities above describe different characters of nodes in a network.
(i) Degree centrality shows the number of nodes' connections, which also reflects connectivity of nodes in a network.Nodes with more connections can be viewed as more important like a leader [13] in a network.
(ii) Betweenness centrality shows the number of shortest paths passing by certain node.It also reveals the dependency of a node from other nodes.Obviously, if a node is dependent on other nodes quite much, the node must be very important for the smooth communication, like a gatekeeper [13].
(iii) Closeness centrality actually measures how far away one node is from other nodes.Apparently, small closeness value of a node reflects its high importance.

Clustering Analysis
Given a criminal network (Δ), for the purpose of exploring the organization and structure of criminals, it is necessary to analyze its clustering or partition.And in the later part, we will find that the change of networks' structure will reveal much more information than expected.Now, the most popular method includes hierarchy clustering [14,15] and blockmodeling.However, always some approaches need us to determine the number of the clusters beforehand, others directly merge different clusters into one whole cluster with strict structure.Inversely, the criminal activities in the real world have become more and more complex and huge, so it is nearly impossible for few leaders to govern the whole organization; thus, emergence of criminal subnetworks is inevitable.Besides, the structure, partition, and even number of these subclusters are unknown before investigation.In this way, effectiveness of hierarchy clustering and blockmodeling may fail.
Here we design an optimal function to avoid the above problems as the following: Here,  is an experimental parameter, which varies from 0 to ∞ and determines the average size of subnetworks.
The bigger the value of  is, the more of sub-networks will generate, but the average size of sub-networks will be smaller.
And the value of  cannot be determined theoretically; it is the result of fitting according to actual cases. represents the number of subclusters.() represents all the possible partitions of  clusters.  represents the th subnetwork's sum of inner edges' weight after the  and () are determined.This optimal function is based on the density of each subnetwork.If the partition (, ()) is determined, then the density is defined as below: () represents the number of nodes of the subnetwork.
From the definition of density, we can find that density increases when the ratio of the inner interaction of subnetwork dominates the whole interaction of the members of the subnetworks, which reveals the information that these members are intensively connected.And we multiply all the subnetworks' densities to get the optimal function: It reflects the overall tight degree of all the subnetworks.The bigger the value of the optimal function ((Δ), , ()) is, the more intensive each subnetwork's inner structure is, and the maxima of the optimal function determines the best clustering method of the network (Δ).
Below is the pseudocode of an algorithm which is designed to find the optimal value of ((Δ), , ()) and the best partition () as follows as: for  from 1 to  do the traversal of all possible  partitions find max ((), ()) and () if ((), ()) <   ← ((), ())  ← () end if end for Remark 1.To find the maxima value of the optimal function, we must calculate the values of all possible partitions, which needs (2  )-space.As the p-cliques do the similar way to how our optimal function runs, we can use the method of p-cliques first as a filtration to choose adequate candidates of possible partitions.After that, we just need to verify which candidate is best by calculating their optimal functions, which will reduce the calculated scale drastically.
Remark 2. This clustering method is based on the designed optimal function; it can determine actual number and structure of subnetworks beforehand perfectly in theory.And the precondition is that  can be gotten precisely through fitting previous known criminal networks.From this way, it is an empirical function.

Analyzing Changes of Networks and Forecasting Future Structure
As usual, the communications between two people are rarely affected by others, and we regard the arbitrary edge's weight as being independent on other edges.So, we extract all edges' weights, respectively, from the sequence of networks and compose them of some new series: (0) ,   (1Δ) ,   (2Δ) ,   (3Δ) ,   (4Δ) , . . .
(Δ) represents the times of communication between  and  at time from Δ to ( + 1)Δ.It also equals the   in networks (Δ).
We regard communication as an event, and   (Δ) is the sum of times the events occurring in a specific time.So, every sequence composes of a time series, and we suggest the methodology in time series analysis [16] to forecast the value   (Δ) of the future.
Beforehand, as different events adjust to different methods, we categorize the criminal event into two kinds.
(1) Crimes which is so serious that we must prevent them from happening in advance: terrorist attacks like 9/11, premeditated kidnaps and homicides, and so forth.
(2) Crimes which have less direct damage and can be monitored for a long term in order to obtain crucial evidence or destroy and arrest the criminal gangs entirely: drag trafficking organization, a group of traitors who steal corporation's accounts and cash, and so forth.
For the first case, it usually goes through a relatively short interval of time from the formation of criminal motive to implementation; therefore, it lacks obvious regularity and shows a drastic trend from preparation to operation.So, we choose the ARIMA model to predict the communication behavior of every two people in the near future.
As to the second case, according to He's investigation [17], the long-term crimes have a certain degree of periodicity and meet the stationarity.We choose the ARMA model to predict long-term crime law.
This part is not the emphasis of what we discuss, and we just give the brief steps of predicting method [18].
(2) Computing the values of ACF and PCF of the given data. ( When  ≪  ≪ , the network of (Δ) is constructed according to prediction.Similar series of relation strength matrixes, each edge's modified centralities, and clustering networks can be gotten in the similar way.
We call these series the expected sequence.

Designing Arresting Strategy
Although criminal gangs will avoid varying their frequency of communication dramatically to avoid the sight of police, we can still determine the possible committing time of crime and the key guys according to the changes of communication clustering structure and modified centrality.We simulate the criminals' communication process and visualize the known and predicted clustering results of the networks.By experiments, the changing trends of some structure and criminal members deserve our attention.

Merging of Two or More Subnetworks.
First of all, we define that if the subnetwork of a network (Δ) retains more than 50% nodes of arbitrary subnetwork of the previous network (( − 1)Δ), then we say that the two subnetworks represent the same subgangs.
As shown in Figure 1, several subnetworks merge into one bigger subnetwork, which contains most nodes of these original subnetworks; we call this integration process.According to simulation, once this process happens, usually a new criminal plan will be implemented in the short term, so the best arresting time will be right before the structure of the new subnetwork stabilizes when all the members are involved and ready.
In this way, we can prevent the crimes from happening, as well, more members involved can be arrested, and more raw evidence can be obtained.
Inversely, when one big subnetwork break up into several smaller subnetworks, we call this division process.It usually means that one crime has just been implemented and members begin evacuating and hiding themselves.Considering this circumstance, we must catch them immediately after we find this process.The later we operate, the fewer criminals we can arrest and the less evidence we can get.

Generation of New Subnetworks.
As shown in Figure 2, sometimes one derivation of one subnetwork does not come along with the disappearances of original subnetworks.It only occupies a small part of nodes of other subnetworks.We call it the derivation process.For the same reason, once this process happens, a new group of criminals begin to gather and plan a new crime, so we must prevent them before the new subnetwork is finally formed.
It also has its reverse process that is called extinction process: one subnetworks disappears with no emergence of new subnetworks.We cope it with the same way of division process.
Remark 3. Structure, differing from the microscopic behavior like several times of communications between limited individuals, is a macroscopic behavior of the whole criminal networks.So structure's change exposes more deeply the essence of the organized criminal behavior, which is beyond the control of criminals themselves.As once the structure is disguised, the following criminal networks' behavior will be distorted inevitably.It is easily to understand.Every member of the criminal gangs, even the leaders, only contacts limited members and information, which, in one way, ensures security by a certain degree of isolation; in another, is caused by distrust between different subgangs.So no one in the gangs can know and determine the whole criminal network.But we can detect and research the whole network by monitoring their communications.
Remark 4. Here, we just list and analyze four simplest possible processes.Other more complex processes like derivation of two or three bigger subnetworks and division into threetier subnetworks or four-tier subnetworks, although can be realized in theorem, are too complicated for actual criminal activities.So, we omit the further discussion.Certainly, other possible processes are worthy to find in the future study.

Determination of Core Members.
In the process of capturing, usually it is hard to catch all the members, as each capturing will invite others' alertness.Therefore, it is an urgent work to find the core members and catch them firstly.Here, we make use of the expected sequences of modified centrality's of every node who is involved in the newly emerging subnetwork during the integration process or derivation process.Through investigating their changing trend, we are hopefully to find those key guys.
(a) If some nodes whose degree centrality shows an increasing trend and reaches the peak among others' in the expected sequences, then according to the above conclusion, they are more possible to be the leaders of this subnetworks.We should catch them firstly.
(b) If some nodes whose betweenness centrality shows an increasing trend and reaches the peak among others' in the expected sequences, then these guys are likely to be the gatekeepers, who make sure the smooth of communication.Catching them will invite others' alertness immediately.So, we should watch them closely and arrest them lastly.
(c) As to other members, we use the method of TOPSIS [19] to determine their importance.
( Beforehand, we define some new notations.
: the vector contains the three centrality measures.
: the ideal value of   represents the one that is sure to be conspirator.
: the ideal value of   represents the one is sure to be nonconspirator.
(): the euclid distance defined represents the relevancy between a suspicious person and the ideal conspiratorial value   .
(): the Euclid distance defined represents the relevancy between a suspicious person and the ideal nonconspiratorial value   .
According to the above analysis, we define a vector of trituple which contains three measures of the form as follows: For the convenience of description,   is called measure vector.It can also be represented in another form as below: where   ( = 1, 2, 3) stands for element  in the vector   .
Three elements are all divided by the maximum value of the values from data of all people we are concerned about.The aim of the division is to make sure that 0 ≤   ≤ 1 ( = 1, 2, 3).According to the definition of three centralities, it is obvious that   will get its optimal value when degree ( 1 ) and betweenness ( 2 ) get to their largest value 1, while closeness ( 3 ) gets to its smallest value 0.
Therefore, the ideal model of a person who is most crucial among the criminal gangs will have his own measure vector   as (1, 1, 0), while the ideal model of a person who plays marginal role will have his own measure vector   as (0, 0, 1).These two ideal vectors will be defined as   and   : ( What we need to do is to calculate the   value of every node.The measure vector   can be used to calculate the "distance" among the nodes.Based on the idea of ranking method from TOPSIS algorithm, we call the "distance" as suspect distance, which is defined as below: This distance is the key to determine the priority list of who are more crucial.A further distance apparently indicates a lower possibility of being a key member.On the other side, if the   () of the arbitrary node  is quite close to 0, then the one represented by the node  must be quite possible a leader or somebody.
Similarly, we can define the Euclid distance of   from the node  to the ideal   as   (), which is called innocence distance.The definition is The suspicious members in the subnetworks will be arranged into an initial priority list.The order of the list is arranged according to the value of   ().Node with smaller   () will be ranked higher in the priority list since   () is the distance from the node to an ideal "conspirator" node.It is also worth mentioning that node with smaller   () value is also with larger   () value, and this demonstrates the correction of our method.Finally, we can determine to catch which one comes first according to this priority list.

An Illustrative Example
Here, we use the data from ICM 2012 problems [20].First, we use the ARMA model offered by program package of software R to forecast the crime networks in the future (see Figure 3).
The first three crime networks are drawn by the given data; the forth one is the result of ARMA model.The indexes of nodes from 1 to 82 represent 82 potential crimes, and the edges between two nodes indicate the two people have been that communicating.
Then, we use the method from clustering analysis and software Pajek to give the result of partition of known crime networks at Δ, 2 * Δ, and 3 * Δ.
As we have got the partition of known crime networks and the prediction of future crime networks, now, we can combine them with time series analysis and clustering analysis to forecast the partition of crime networks at 4 * Δ.
Similarly, we give the 3D picture of Figure 5.
Observing Figures 4 and 6, we can find that the partition of green part has a growing trend with a decreasing acceleration.In Figure 6, the partition of green part has become constant gradually.
According to the implication from Figures 4 and 6, there is a high possibility that the criminals represented by green nodes were planning a new crime and would come to maturity at 4 * Δ, which merits the high attention of the police.

Conclusion
To choose different perspectives to analyze problems, it will usually obtain some new feelings.In order to close the essence of the criminal network, we abandon the common SNA and frequency analysis or density analysis in DNA, and instead, we observe the overall structure's change of networks.
Although it is hard to analyze quantitatively, we can visualize this process and observe it directly.In this way, we exemplify our superiority to touch overall networks' image, which is hard for criminals to disguise.

Figure 1 :
Figure 1: It is a visualization of change of networks' structure in a predicted sequence from Δ to ( + 4)Δ, which shows a simple integration process.Different partitions are distinguished by different colors of nodes.

Figure 2 :
Figure 2: It is a visualization of change of networks' structure in a predicted sequence from Δ to ( + 4)Δ, which shows a simple derivation process.Different partitions are distinguished by different colors of nodes.

Figure 4 :
Figure 4: Left column is the exact result of partition of crime networks at Δ, 2 * Δ, and 3 * Δ; right column shows the visualization results of partition in 3D space.The nodes with the same color mean that they belong to the same partition.

Figure 5 :
Figure 5: The future partition of crime networks at 4 * Δ.
11072059 and by the Natural Science Foundation of Jiangsu Province of China under Grant no.BK2012741.