DEDGCN: Dual Evolving Dynamic Graph Convolutional Network

With the wide application of graph data in many fields, graph representation learning has become a focus of research. In particular, dynamic graph representation learning is essential for handling graphs that change over time in real systems. On the one hand, most dynamic graph representation methods focus either on graph structure changes or on node embedding changes, ignoring the internal relationship between the two. On the other hand, most dynamic graph neural networks must learn node embeddings from specific tasks, so the embeddings generalize poorly and cannot be used in unsupervised tasks. Hence, the Dual Evolving Dynamic Graph Convolutional Network (DEDGCN) is proposed to solve these problems. DEDGCN uses recurrent neural networks to drive the evolution of both the GCN and the nodes, extracting the structural features of the dynamic graph and learning the stability features of nodes, respectively, to form an adaptive dynamic graph convolutional network. DEDGCN is an unsupervised graph convolutional network. Thus, it can be trained on unlabeled dynamic graphs, has more extensive application scenarios, and produces node embeddings with strong generality. We evaluate the proposed method on three tasks: node classification, edge classification, and link prediction. In the classification tasks, on graphs with large scale, complex connections, and uncertain change rules, DEDGCN reaches an F1 value of 77% for node classification and more than 90% for edge classification. The results show that DEDGCN is effective in capturing graph features and performs much better than the baseline methods, which demonstrates the importance of capturing node stability features in dynamic graph representation learning. In addition, the ability of DEDGCN in unsupervised tasks is further verified through clustering and anomaly detection tasks, which shows that the embeddings learned by DEDGCN are widely applicable.


Introduction
There are many graph data in complex practical systems [1], such as social platforms, financial investment platforms, and e-commerce platforms. Because of their varied shapes and complex connection relationships, graphs are more difficult to represent than other types of data. This makes graph representation the key to applying graph data. Graph representation is also called graph embedding (GE). At present, mature methods mainly include DeepWalk [2] and Node2vec [3] based on random walks, GraRep [4] and HOPE [5] based on matrix decomposition, Struc2vec [6] and LINE [7] based on graph structure characteristics, SDNE [8] and DRNE [9] based on neural networks, and graph neural networks [10]. These methods are mostly applied to static graphs. In practice, a graph does not remain unchanged but changes constantly over time, for example, through new friendships in a relationship network, devices added to or removed from a topology network, and commodity turnover in an e-commerce network. These graphs have time attributes. Static network representation learning cannot capture the temporal characteristics of dynamic graphs and cannot learn and update the changes of nodes and of the relationships between nodes, so the learned graph representation is neither faithful nor up to date.
Graph Convolutional Networks (GCN) [11] have a simple structure, low complexity, short training time, and a good learning effect, and many models use them as the basis of graph representation. GCN cannot capture the changing features of a dynamic graph, but a Recurrent Neural Network (RNN) can extract these features by processing time-ordered sequences. In other words, the combination of GCN and RNN can effectively process dynamic graphs. GCRN [12], RgCNN [13], etc., feed the structural features captured by GCN into an RNN, learn the temporal relationships of nodes, and obtain the time features. These methods separate the structural features of the graph from the time features, ignoring their internal connection. Moreover, a single GCN model is unable to capture dynamic structural features adaptively. EvolveGCN [14] uses an RNN to evolve the parameters of the GCN so that the GCN adapts to the shape of the graph at different times and extracts structural features with time characteristics. In addition, graphs show periodic changes in many cases, such as the significant differences in communication between employees of an enterprise on working days and rest days: employees are closely connected on working days and sparsely connected on rest days. However, the essential attributes of the employees have not changed. Mapped to the graph representation, nodes that do not appear in a snapshot still maintain their original properties and characteristics, which is the stability of nodes. EvolveGCN does not account for node stability and does not learn the inherent attributes of nodes. Furthermore, GCN is in essence trained by supervised learning, which requires labeled data and is difficult to apply to tasks such as clustering and community discovery, making the model poor in universality.
Through the above analysis, we summarize the challenges faced by dynamic graph representation learning. The first problem is the stability of nodes. When capturing dynamically changing graph structure features, the embedding of nodes is entirely influenced by the graph structure. Still, the attributes of the nodes themselves determine the nature of nodes, and this nature does not change significantly with time under normal circumstances, which is the stability feature of nodes. Maintaining the stability of nodes in a dynamically changing graph structure is a significant challenge. The second problem is training on unlabeled graph data. The node embeddings obtained by supervised learning are suited to specific tasks, and their universality is poor. When faced with unsupervised tasks such as clustering and anomaly detection, such node embeddings cannot be directly applied. Given these two problems, this paper proposes a dual evolution dynamic graph convolutional neural network, DEDGCN. Firstly, the stability characteristics of nodes are learned through the node evolution module. Secondly, the loss function is calculated using the stability characteristics of nodes, and an unsupervised dynamic graph representation model is constructed while modifying the network architecture. The main contributions of this paper are as follows:
(i) We propose a dual evolution model. Based on EvolveGCN, a node evolution module is added to form a dual evolution model. While capturing the structural characteristics of a dynamic graph, the stability characteristics of nodes are learned. The GCN network architecture is jointly revised from the graph structure and the nodes to build an adaptive dynamic graph convolution model.
(ii) We build an unsupervised GCN model. We use the node stability of the dynamic graph and the prediction ability of the node evolution module to construct a loss function independent of data labels. Without relying on specific data labels and tasks, DEDGCN is an unsupervised dynamic graph convolutional network and can be applied to a broader range of tasks.
Organization: Section 2 describes related work, Section 3 gives the relevant concepts, Section 4 presents our solution, Section 5 reports an experimental evaluation of our method, and Section 6 provides a summary.

Related Work
The purpose of graph representation learning is to obtain low-dimensional dense vectors of nodes and apply them to downstream tasks such as node classification, link prediction, network reconstruction, community discovery, and network data visualization [15]. At present, most graph representation learning methods are aimed at static graphs. When faced with dynamic graph data, these methods cannot capture the changing characteristics of the graph, resulting in unsatisfactory results in application. Dynamic and static graph data are the same in essence, so many dynamic graph representation methods evolve from static models.
At the earliest stage, node embeddings were generated by decomposing the Laplacian and adjacency matrices of graph data, which is called Graph Factorization (GF). Typical examples are GraRep [4] and HOPE [5]. These algorithms use matrix decomposition to calculate eigenvalues and eigenvectors that represent the node embeddings. The key to graph decomposition is obtaining the eigenvalues and eigenvectors, which directly determine the quality of node embedding. For dynamic graph data, to reduce algorithm complexity, the eigenvalues and eigenvectors are evolved to update the node embeddings; the most typical method is DHPE [16]. In addition, another direct factor that determines the quality of node embedding is the node itself. TIMERS [17] decomposes the adjacency matrix of the graph at the initial time and updates the node embeddings for the graphs at subsequent times. After each update, the loss of the graph representation is calculated. When the loss exceeds a threshold, the matrix decomposition is performed again on the current graph. However, as the graph scale increases, matrix decomposition becomes more and more difficult, and the node representations it generates are hard to interpret. This problem makes it challenging to apply graph representation learning methods based on matrix decomposition to large-scale graphs.
Dynamic graphs have many structural characteristics, and many scholars use them to model the generation probability of graphs in order to calculate node embeddings. For example, DyREP [18] uses the relational evolution and social evolution in dynamic graphs to describe the characteristics of nodes, the periodic changes of nodes, and the influence of external nodes on embedding from the perspectives of local embedding propagation, self-propagation, and external factor driving. Then, DyREP uses the probability of node emergence to reconstruct the data and calculate node embeddings. Zhou et al. [19] adopted the triadic closure process, combined with social homophily and temporal smoothness, to construct the loss function and reconstruct the graph to obtain node embeddings. These methods are all transductive. Whenever new graph data arrive, the model must be retrained to obtain node embeddings. The process is complex, time-consuming, and inefficient.
With the appearance of Graph Neural Networks (GNN), graph embedding has entered a new stage of development, especially with GCN, which realizes end-to-end learning on graphs of any size and shape. GCN performs well in node classification, link prediction, and other tasks, and has the advantages of a simple structure, few parameters, a strong ability to extract graph features, and suitability for large-scale graphs. At present, many graph embedding methods are based on GCN, learning the timing information of the nodes themselves and capturing the time characteristics of graph data. For example, GCRN [12] combines GCN with RNN and captures the timing information of the graph by feeding node embeddings into the RNN. Similar ideas include WD-GCN/CD-GCN [20] and RgCNN [13]. However, these methods start only from the dynamic changes of the nodes themselves, ignoring the constant changes of the graph structure. The learned structural features are fixed and uniform, and the adaptability to frequently changing graphs is poor. To make the model adapt to changes of the graph, Addgraph [21] uses an attention mechanism to aggregate the GCN parameters of a past period to generate the current GCN parameters. EvolveGCN [14] also targets the ever-changing graph structure. It uses an RNN to evolve the GCN and builds a dynamic GCN model, which ensures that the model can learn the changing graph structure adaptively. However, GCN is still in essence a kind of supervised learning, which cannot extract the structural and dynamic features of unlabeled data, so it fails on community division, clustering, anomaly detection, and similar tasks. The most common method for unsupervised learning on graphs is the autoencoder network, which utilizes the symmetry between encoder and decoder to generate highly nonlinear node embeddings. For dynamic graphs, DynGEM [22] increases the number and width of the encoder and decoder layers according to the graph size at each time, which ensures the adaptive change of the model. At the same time, using the first-order and second-order proximity of nodes, the losses of the local structure and global structure of the graph are calculated to construct the loss function and train the network. Dyngraph2vec [23] replaces the neurons of the autoencoder with long short-term memory (LSTM) cells. It takes the historical neighbor information of the nodes over multiple time steps as input so that the model can capture time characteristics. Dyngraph2vec constructs the loss function by prediction: it feeds multiple historical graphs into the network, predicts the graph structure at the next time, and compares it with the real graph structure to form the loss. The autoencoder network includes multi-layer encoders and decoders, and there are many parameters in the training process, limiting its application to large-scale graphs. The dynamic network representation learning methods discussed above are summarized in Table 1 according to their learning methods to facilitate further research.

Relevant Concepts
In this section, we formally define the basic concepts and related issues of dynamic graph representation learning.

Definition 2. (Dynamic Graph) A series of static graphs that change continuously constitutes a dynamic graph, denoted by $G = \{G_1, G_2, \cdots, G_T\}$ with $G_t = (V_t, E_t)$, where $t$ is the serial number of the snapshot, $V_t$ is the set of nodes in snapshot $t$, $E_t$ is the set of edges in snapshot $t$, and $A_t$ is the adjacency matrix of the graph in snapshot $t$.
Generally speaking, we define the problems related to our work as follows.
Problem. (Dynamic Graph Representation Learning) For a given $G = \{G_1, G_2, \cdots, G_T\}$, dynamic graph representation learning is defined as mapping nodes in snapshot

Methodology
This section puts forward a dual evolving dynamic graph convolutional network, DEDGCN, whose framework is shown in Figure 1. DEDGCN mainly includes GCN evolution and node evolution, which learn the dynamic structure characteristics of the graphs and the stability characteristics of nodes, respectively, to build an adaptive unsupervised dynamic graph convolutional network.
In GCN evolution, we follow the evolution process of EvolveGCN and use an RNN to capture how the GCN changes across historical snapshots in order to generate the GCN needed for the current snapshot. GCN parameters are the key to building a GCN model. Using an RNN to evolve the GCN parameters can quickly correct the GCN so that it captures changing graph structure features.
In node evolution, we mainly consider the stability characteristics of nodes. The attributes of the nodes themselves are stable during the normal development of the graph and do not change significantly. The way a node changes over time contains a lot of temporal information, which reflects the fluctuation of its attributes. By capturing this temporal information, we can predict the changes of nodes and obtain their stability characteristics. Similarly, we use an RNN to learn the timing information of nodes, extract the attributes of the nodes themselves, and ensure that node embeddings are generated stably over time. We modify the GCN by constructing the loss function and feeding the stability characteristics of nodes back to the GCN. The loss function mainly comprises the node embedding generated by the GCN and the node embedding predicted by the RNN. The RNN can predict the embedding of nodes at the next moment by learning the timing information of nodes. In vector space, the predicted value should be as close as possible to the value generated by the GCN to ensure the authenticity of the learned timing information. In addition, the stability of nodes means that the embeddings of nodes in adjacent snapshots do not change significantly, so the values generated by the GCN for adjacent snapshots should be close to each other. Therefore, we use the gap between the predicted value and the value generated by the GCN, together with the distance between node embeddings in adjacent snapshots, to construct the loss function, which provides constant feedback to correct the GCN. This ensures that the node embedding generated by the GCN carries the node stability characteristics contained in the predicted value. At the same time, the loss function of DEDGCN does not depend on data labels, forming an unsupervised dynamic graph convolutional network. The number of nodes in the graph changes from snapshot to snapshot in a dynamic graph. To describe snapshots conveniently, we use N to indicate the number of nodes in each snapshot.

Table 1: Dynamic network representation learning methods.
Method | Learning techniques | Supervised / Unsupervised
DHPE [16] | Matrix decomposition, embedding update | √
TIMERS [17] | Matrix decomposition, embedding update | √
DyREP [18] | Dynamic network structure characteristics | √
DynamicTriad [19] | Dynamic network structure characteristics | √
GCRN [12] | Splicing GCN and RNN | √
WD-GCN/CD-GCN [20] | Splicing GCN and RNN | √
RgCNN [13] | Splicing GCN and RNN | √
Addgraph [21] | Attention mechanism evolves GCN | √
EvolveGCN [14] | RNN evolves GCN | √
DynGEM [22] | Scalable autoencoder network | √
Dyngraph2vec [23] | LSTM evolves autoencoder network | √

GCN Evolution.
We use the GCN to extract the structural features of the graph. GCN contains multiple graph convolution layers, which aggregate multi-hop neighbor features and capture structural features. For $G_t = (V_t, E_t)$, the propagation rule between graph convolution layers is as follows:
$$H_t^{(l+1)} = \sigma\left(\hat{A}_t H_t^{(l)} W_t^{(l)}\right), \quad \hat{A}_t = D^{-1/2} A'_t D^{-1/2}, \quad A'_t = A_t + I$$
Here, $H_t^{(l)}$ represents the output of snapshot $t$ after the $l$-th graph convolution layer. $\hat{A}_t$ is the normalized form of the adjacency matrix $A_t$ of snapshot $t$. $A'_t$ is the adjacency matrix of snapshot $t$ with self-connections. $D$ is the degree matrix with $D_{ii} = \sum_j A'_{ij}$, and $W_t^{(l)}$ represents the parameters of the $l$-th graph convolution layer at time $t$. $\sigma(\cdot)$ refers to an activation function, such as ReLU or sigmoid.
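A minimal NumPy sketch of one graph convolution layer for a single snapshot may help make the propagation rule concrete. It assumes the symmetric normalization $D^{-1/2} A' D^{-1/2}$ commonly used with GCN and a ReLU activation; the function names and the toy graph are illustrative and not taken from the paper.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix A."""
    A_self = A + np.eye(A.shape[0])      # add self-connections
    deg = A_self.sum(axis=1)             # D_ii = sum_j A'_ij, always >= 1
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return A_self * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_hat, H, W):
    """One graph convolution layer with ReLU activation."""
    return np.maximum(A_hat @ H @ W, 0.0)

# Toy snapshot: a 3-node path graph 0 - 1 - 2 with one-hot input features.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
A_hat = normalize_adjacency(A)
H0 = np.eye(3)                           # input features, one row per node
W0 = np.ones((3, 2))                     # layer parameters (illustrative values)
H1 = gcn_layer(A_hat, H0, W0)            # node representations after one layer
```

Stacking several such layers lets each node aggregate information from neighbors that are several hops away.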
To enable the GCN to capture the structural features of dynamic graphs adaptively, we follow EvolveGCN in evolving the parameters of the GCN, so that the GCN architecture adapts to the changing graph. At present, the most commonly used RNN cell is Long Short-Term Memory (LSTM) [24], which consists of an input gate, a forget gate, and an output gate; it selectively captures the information of a time series, keeping the key content and forgetting the redundant content. For the parameter evolution of GCN, we choose LSTM as the memory cell, and the process is shown in Figure 2. We input the GCN parameter $W_{t-1}^{(l)}$ of the $l$-th layer at time $t-1$ into the LSTM and obtain the GCN parameter $W_t^{(l)}$ of the $l$-th layer at time $t$. The GCN parameter evolution is calculated as follows:
$$W_t^{(l)} = \mathrm{LSTM}\left(W_{t-1}^{(l)}\right)$$
In the process of GCN parameter evolution, the LSTM constantly learns how the GCN parameters change across consecutive snapshots. The memory cells are updated continuously, and new GCN parameters are continually predicted so that the GCN can adaptively modify its own form as the graph changes and capture the dynamic structural features effectively.
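The parameter evolution step can be sketched as a matrix-valued LSTM update in which the previous GCN weights act as the input. This is only one plausible form, written under the assumption that each gate applies a square matrix to the weight matrix; the helper `init_gate_params`, the gate layout, and the initialization are hypothetical and are not the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gate_params(d_in, d_out, rng):
    """Hypothetical parameters: one (U, B) pair per gate (forget, input, output, candidate)."""
    return {
        gate: (rng.normal(scale=0.1, size=(d_in, d_in)), np.zeros((d_in, d_out)))
        for gate in ("f", "i", "o", "c")
    }

def evolve_weights(W_prev, C_prev, params):
    """One LSTM-style step mapping the GCN weights at t-1 to the weights at t."""
    gates = {g: U @ W_prev + B for g, (U, B) in params.items()}
    f, i, o = sigmoid(gates["f"]), sigmoid(gates["i"]), sigmoid(gates["o"])
    candidate = np.tanh(gates["c"])
    C = f * C_prev + i * candidate       # updated memory cell
    W = o * np.tanh(C)                   # evolved GCN weights for snapshot t
    return W, C

rng = np.random.default_rng(0)
params = init_gate_params(d_in=4, d_out=2, rng=rng)
W = rng.normal(size=(4, 2))              # initial GCN layer weights
C = np.zeros_like(W)                     # initial memory cell
for _ in range(3):                       # evolve over three consecutive snapshots
    W, C = evolve_weights(W, C, params)
```

In this form the evolved weights are bounded in (-1, 1) by the output gate and the tanh; a trained model would learn the gate parameters by backpropagation rather than fix them at random values.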

Node Evolution.
LSTM has an excellent ability to capture timing information. We use LSTM to learn the behavior of nodes in consecutive snapshots, extract the stability characteristics of nodes, and predict the state of nodes in the next snapshot. The evolution process of nodes is shown in Figure 3.
We input the embeddings $X_{t-w+1}, X_{t-w+2}, \cdots, X_t$ calculated by the GCN for consecutive snapshots into the LSTM and obtain the predicted embedding $P_{t+1}$ at time $t+1$. Here, $w$ represents the window size of the LSTM and $P_t$ denotes the node embedding predicted for time $t$. The calculation is shown in the following formula:
$$P_{t+1} = \mathrm{LSTM}\left(X_{t-w+1}, X_{t-w+2}, \cdots, X_t\right)$$
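The windowed prediction can be illustrated with a small NumPy LSTM that treats every node as one batch row and returns its final hidden state as the predicted embedding. This is a sketch under stated assumptions: the class `NodeLSTM`, the random untrained weights, the fixed node count across snapshots, and the use of the last hidden state as $P_{t+1}$ are all illustrative choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class NodeLSTM:
    """Single-layer LSTM shared by all nodes; each node is one batch row."""

    def __init__(self, dim, rng):
        # One weight matrix per gate acting on [x, h]; order: forget, input, output, candidate.
        self.W = rng.normal(scale=0.1, size=(4, 2 * dim, dim))
        self.b = np.zeros((4, dim))
        self.dim = dim

    def predict_next(self, window):
        """window: list of w arrays X_{t-w+1}, ..., X_t, each of shape (N, dim)."""
        n = window[0].shape[0]
        h = np.zeros((n, self.dim))
        c = np.zeros((n, self.dim))
        for X in window:
            z = np.concatenate([X, h], axis=1)           # (N, 2 * dim)
            f = sigmoid(z @ self.W[0] + self.b[0])
            i = sigmoid(z @ self.W[1] + self.b[1])
            o = sigmoid(z @ self.W[2] + self.b[2])
            g = np.tanh(z @ self.W[3] + self.b[3])
            c = f * c + i * g
            h = o * np.tanh(c)
        return h                                         # used here as P_{t+1}

rng = np.random.default_rng(0)
window = [rng.normal(size=(5, 8)) for _ in range(3)]     # w = 3 snapshots, N = 5 nodes
P_next = NodeLSTM(dim=8, rng=rng).predict_next(window)
```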
By learning the node states within the window $w$, the predicted node embedding includes the attribute characteristics of the node itself and the stability characteristics in the changing process. We propagate the attributes of nodes into the GCN through the loss function and thereby further modify the GCN.

Construction of Loss Function.
We use the node embedding generated by GCN and the node evolution results to construct the loss function and modify the GCN model. Let us go into the details below.
In the normal evolution of dynamic networks, the characteristics of nodes do not change drastically. Even in dynamic graphs where nodes are frequently added or removed, the inherent characteristics of nodes do not change enormously, which reflects the stability of nodes. Transformed into vector space, this property means that the embeddings of a node in adjacent snapshots are highly similar. Therefore, we express it by calculating the embedding similarity of nodes in adjacent snapshots, and the calculation method is shown in formula (4).
In formula (4), $x_i^t$ denotes the embedding generated by the GCN for node $i$ at time $t$, and similarity denotes the cosine similarity function.
In the process of node evolution, LSTM is used to predict node embeddings, yielding the expected embedding of each node at the next moment. When the predicted value is consistent with the value generated by the GCN, the GCN model acquires the ability to extract the stability features of nodes. The same holds for the ability to extract the structural characteristics of nodes. We calculate the similarity between the predicted value and the GCN-generated value to keep them consistent. The calculation method is shown in formula (5), where $p_i^t$ represents the embedding predicted by the LSTM for node $i$ at time $t$.
In addition, to prevent overfitting during training, we add a weight attenuation (weight decay) term to the loss function to reduce the parameter weights, as shown in formula (6), where $w$ represents the GCN parameters.
The final loss function combines the terms of formulas (4), (5), and (6).
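A label-free loss of this kind can be sketched as follows. The exact form is an assumption: each similarity term is written as one minus the mean cosine similarity, the terms are summed with equal weight, and weight attenuation is an L2 penalty scaled by a hypothetical coefficient `weight_decay`. The paper's actual weighting may differ.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-12):
    """Row-wise cosine similarity between two (N, d) arrays."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps
    return num / den

def dedgcn_style_loss(X_t, X_prev, P_t, gcn_weights, weight_decay=1e-4):
    """Sum of a stability term, a prediction term, and an L2 penalty (assumed form)."""
    stability = np.mean(1.0 - cosine_similarity(X_t, X_prev))   # adjacent snapshots
    prediction = np.mean(1.0 - cosine_similarity(P_t, X_t))     # LSTM vs. GCN
    l2 = sum(np.sum(W ** 2) for W in gcn_weights)               # weight attenuation
    return stability + prediction + weight_decay * l2

X = np.array([[1.0, 0.0], [0.0, 2.0]])
loss_consistent = dedgcn_style_loss(X, X, X, gcn_weights=[])
loss_opposed = dedgcn_style_loss(X, X, -X, gcn_weights=[])
```

With identical embeddings the loss is close to zero, and it grows to about 2 when the prediction points in the opposite direction, so minimizing it pulls the GCN output toward both the previous snapshot and the LSTM prediction without using any labels.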

Dataset.
We evaluate DEDGCN on publicly available datasets, which are described below.
Elliptic [25] is a bitcoin transaction graph in which nodes represent transactions and edges represent bitcoin flows between transactions. The nodes in this graph fall into two categories, legal and illegal, so we use this dataset for the node classification and node clustering tasks.
Bitcoin alpha is a platform for trading bitcoin, and the trading behaviors among its users form a graph. Members of the platform rate other members on a scale of -10 to +10, which indicates the trust level of each member. According to the trust score, we divide the ratings into two categories: trustworthy and untrustworthy. The transaction behaviors among users are built into a dynamic network in which nodes represent users and edges represent transactions between users. In the process of rating each transaction, users are given a label of trust or distrust. For Bitcoin alpha, we carry out the edge classification and link prediction tasks.
Reddit Hyperlink is a graph built by extracting the link relationships between Reddit posts. Each link contains a timestamp and the sentiment (positive or negative) of the source post toward the target post. We perform edge classification on this dataset to predict the sentiment relationships between unlinked posts.
UCI is a graph composed of private messages sent on a campus social platform of UC Irvine. Users can search for other people on this platform and then start conversations based on profile information. An edge represents a private message sent by user u to user v at time t. We perform link prediction on this dataset to capture the possible contacts of users at the next moment.
AS network is a graph composed of connections between autonomous systems. Nodes represent autonomous systems, while edges represent connectivity between them. We perform link prediction on this graph to predict the connectivity of the network at the next moment.
Myanmar network refers to the network composed of autonomous systems applied for by Myanmar. Nodes represent autonomous systems belonging to Myanmar, and edges represent the connectivity between autonomous systems. Through continuous monitoring of Myanmar's autonomous system network, warnings can be raised for abnormal network behavior, providing a reference for maintaining the regular operation of the network.
The basic information of these datasets is shown in Table 2. According to the characteristics of the different datasets, we divide them into snapshots at different time intervals and apply them to different tasks.
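Dividing a timestamped edge list into snapshots can be sketched as below. The `(u, v, timestamp)` tuple layout, the function name, and the choice to keep empty intervals as empty snapshots are assumptions made for illustration; the interval itself would be chosen per dataset.

```python
from collections import defaultdict

def split_into_snapshots(edges, interval):
    """Group (u, v, timestamp) edges into consecutive snapshots of equal duration.

    Snapshot k holds the edges whose timestamp falls in
    [start + k * interval, start + (k + 1) * interval), where start is the
    earliest timestamp. Intervals without edges yield empty snapshots.
    """
    if not edges:
        return []
    start = min(ts for _, _, ts in edges)
    buckets = defaultdict(list)
    for u, v, ts in edges:
        buckets[int((ts - start) // interval)].append((u, v))
    return [buckets.get(k, []) for k in range(max(buckets) + 1)]

edges = [("a", "b", 0), ("b", "c", 5), ("a", "c", 12), ("c", "d", 31)]
snapshots = split_into_snapshots(edges, interval=10)
```

Each resulting edge list can then be turned into the adjacency matrix of one snapshot.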

Baseline.
We compare DEDGCN with the following five baseline methods.
Method 1: GCN is a static graph convolutional neural network that cannot extract dynamic features. We fuse all snapshots into a single static graph to assess the importance of time features. At the same time, GCN is the structural feature extractor in DEDGCN, so it is tested as a unit module of DEDGCN.
Method 2: GCN-GRU generates node embeddings with a fixed GCN and evolves the temporal relationship of nodes with a GRU. This method can be regarded as the node embedding evolution module and is tested as a unit module of DEDGCN.
Method 3: DynGEM is a dynamic unsupervised network representation learning method based on a deep autoencoder. DynGEM adapts to the graphs at different times by evolving the parameters of the deep autoencoder, which ensures that network structural features are learned and that the relationships between graphs at different times are captured, making the network embeddings at different times continuous.
Method 4: Dyngraph2vec is a dynamic unsupervised node representation method that integrates a deep autoencoder, LSTM, and MLP networks. Neurons in the deep autoencoder are replaced by long short-term memory cells to construct a deep autoencoder with a memory function. According to the combination of the three networks, dyngraph2vec has three versions: dyngraph2vecAE, dyngraph2vecRNN, and dyngraph2vecAERNN. Dyngraph2vecAE and DynGEM are similar in architecture, so the second and third versions are used for comparison.
Method 5: EvolveGCN is similar to GCN-GRU, but it uses an RNN to evolve the GCN parameters and learn the temporal relationship of the model. This method is regarded as the GCN parameter evolution module and is tested as a unit module of DEDGCN.

Metric.
The evaluation metrics used in this paper are the F1 value and MAP. Their calculation is described in detail below.
First, we make the following definitions: TP (true positive) is the number of positive samples predicted as positive, FP (false positive) is the number of negative samples predicted as positive, TN (true negative) is the number of negative samples predicted as negative, and FN (false negative) is the number of positive samples predicted as negative. Through the above four definitions, we can calculate the precision and recall of the prediction results:
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \quad \mathrm{Recall} = \frac{TP}{TP + FN}$$
The F1 value is the harmonic mean of precision and recall, which objectively reflects the validity of the prediction:
$$F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
AP is the integral of the PR (precision-recall) curve, i.e., the average precision over all recall values between 0 and 1:
$$AP = \int_0^1 p(r)\,dr$$
MAP is the mean of AP over all categories, where $K$ indicates the number of categories:
$$MAP = \frac{1}{K} \sum_{k=1}^{K} AP_k$$
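The metrics can be computed with a few lines of plain Python. The sketch below assumes binary labels with 1 as the positive class and approximates the AP integral by the usual discrete sum, i.e., the mean of the precision at each rank where a positive sample is retrieved; the function names are illustrative.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 value from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def average_precision(y_true, scores):
    """Discrete AP: mean precision at the ranks of the positive samples."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(ap_values):
    """MAP: mean of the AP values of the K categories."""
    return sum(ap_values) / len(ap_values)

f1 = f1_score([1, 0, 1, 1], [1, 1, 0, 1])           # TP = 2, FP = 1, FN = 1
ap = average_precision([1, 0, 1], [0.9, 0.8, 0.7])  # positives at ranks 1 and 3
map_value = mean_average_precision([ap, 1.0])
```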

Task.
In this paper, we prove the effectiveness of our proposed DEDGCN through four tasks: node classification, edge classification, link prediction, and anomaly detection.

Node Classification.
Node classification predicts the types of unlabeled nodes by learning the characteristics of labeled nodes. In this paper, a feedforward neural network followed by a softmax function computes class probabilities from the node embedding to judge the type of node u at time t. For node classification, we use the F1 value to measure the effectiveness of the method.
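The classification head can be sketched as a one-hidden-layer feedforward network followed by softmax. The layer sizes, the ReLU activation, and the random untrained weights below are assumptions for illustration only.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def classify_nodes(X, W1, b1, W2, b2):
    """Map node embeddings X (N, d) to class probabilities and predicted labels."""
    hidden = np.maximum(X @ W1 + b1, 0.0)        # feedforward layer with ReLU
    probs = softmax(hidden @ W2 + b2)            # class probabilities per node
    return probs, probs.argmax(axis=1)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                      # embeddings of 5 nodes at time t
W1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 2)), np.zeros(2)   # two classes, e.g. legal / illegal
probs, labels = classify_nodes(X, W1, b1, W2, b2)
```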

Edge Classification.
Edge classification predicts the types of unlabeled edges by learning the features of labeled edges. In this paper, a feedforward neural network followed by a softmax function computes class probabilities from the edge embedding to judge the edge type at time t. The edge embedding is obtained by aggregating the representations of node u and node v with the Hadamard product. Edge classification is measured by the F1 value.
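The Hadamard aggregation is simply an element-wise product of the two endpoint embeddings. A small sketch, assuming nodes are indexed by integers and edges are given as `(u, v)` pairs:

```python
import numpy as np

def edge_embeddings(X, edges):
    """Hadamard (element-wise) product of the endpoint embeddings of each edge."""
    src = np.array([u for u, _ in edges])
    dst = np.array([v for _, v in edges])
    return X[src] * X[dst]

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [0.0, -1.0]])                      # embeddings of nodes 0, 1, 2
E = edge_embeddings(X, [(0, 1), (1, 2)])         # one row per edge
```

Because the product is symmetric, the edges (u, v) and (v, u) receive the same embedding. The resulting rows can be fed to a classifier in the same way as node embeddings.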

Link Prediction.
Link prediction predicts whether an edge exists at time t + 1 from the embeddings of node u and node v before time t + 1. We aggregate the embeddings of node u and node v and then use an MLP to obtain the existence probability of the edge. For link prediction, we use MAP to measure the effectiveness of the results.

Anomaly Detection.
Through the node embeddings in consecutive snapshots, the normal state of a node at the next moment is predicted, and the predicted value is compared with the actual value of the node to judge whether the node is abnormal.

Details
(1) For any dataset, we use a one-hot node-degree as the input feature of the model.
As shown in Figure 4, DEDGCN performs the node classification task on the Elliptic dataset with an F1 value much higher than those of GCN, GCN-GRU, and EvolveGCN. GCN is not effective in dynamic graph representation learning: its F1 value is only 47% in the node classification task, which shows that graph structure changes have a great impact on the generation of node embeddings during dynamic evolution. The lack of time information leaves a lot of outdated information in the graph and much noise in the structural features, so the node embeddings perform poorly in practical tasks. The two versions of EvolveGCN, version H and version O, are also not effective in node classification on the Elliptic dataset; for version H in particular, the F1 value is 44%. To sum up, the attributes of nodes in Elliptic are relatively stable, and nodes rarely convert between legal and illegal. GCN-GRU and the proposed DEDGCN can evolve nodes, extract the inherent features of nodes, and maintain the stability of nodes. Their F1 values are higher than that of EvolveGCN, which shows the importance of node stability features in the process of snapshot evolution. In addition, the F1 value of DEDGCN using an MLP for supervised classification reaches 77%. Besides supervised classification, the K-means clustering algorithm is used for unsupervised classification of node types, and its result is only 1% lower than that of the supervised method. This also shows that the node embeddings learned by DEDGCN are suitable for unsupervised clustering tasks.

Edge Classification.
We apply the Bitcoin Alpha and Reddit Hyperlink Network datasets to the edge classification task. For both GCN parameter evolution and node embedding evolution, we adopt a time window size of 5. GCN, GCN-GRU, EvolveGCN, and DEDGCN constitute unit test experiments to verify the effectiveness of the different modules of DEDGCN in the edge classification task. The experimental results are shown in Figure 5.
As shown in Figure 5, in the edge classification task, the F1 value of DEDGCN reaches 93% on Bitcoin Alpha and 90% on the Reddit Hyperlink Network dataset, far better than that of each unit module. The F1 value of EvolveGCN for edge classification is higher than those of GCN and GCN-GRU. In a dynamic graph, nodes rely on edges to form a network; edges are an important component of the graph, and their characteristics depend on the network structure. EvolveGCN and DEDGCN rely on the GCN parameter evolution module to capture dynamic network structure characteristics, while GCN and GCN-GRU can only extract fixed network structure characteristics and have no updating ability. The network embeddings they learn contain a lot of outdated information, which cannot reflect the current state of the graph, and perform poorly in edge classification tasks.

Table 2: Basic information of the datasets.
Dataset | Nodes | Edges | Snapshots | Tasks
Elliptic | 230769 | 234355 | 49 | Node classification and clustering
Bitcoin alpha | 3783 | 24186 | 136 | Link prediction, edge classification
Reddit hyperlink network | 55863 | 858490 | 174 | Edge classification
UCI | 1899 | 59835 | 192 | Link prediction
AS network | 6474 | 13895 | 100 | Link prediction
Myanmar | 209 | 48857 | 75 | Anomaly detection
Combining the experimental results of the node classification and edge classification tasks, DEDGCN has different advantages in different tasks. In the node classification task, the stability feature captured by the node embedding evolution module helps to improve the node classification effect. In the edge classification task, the dynamic network structure feature captured by the GCN parameter evolution module helps to improve the edge classification effect. Compared with GCN, GCN-GRU, and EvolveGCN, the F1 value of DEDGCN is higher than that of each unit module, which shows the importance of the GCN parameter evolution module and the node embedding evolution module in DEDGCN. DEDGCN is an integral part of dynamic graph representation learning.

Link Prediction.
We conducted link prediction experiments on Bitcoin Alpha, UCI, and AS Network datasets, respectively, and the results are shown in Table 2.
From Table 3, we can see that the MAP values of DEDGCN are higher than those of the other methods and remain above 0.15, indicating a good effect on the link prediction task. The MAP (mean average precision) value combines precision and recall and reflects the global performance of the algorithm. Bitcoin Alpha, UCI, and AS Network belong to three different types of graph data: transaction network, social network, and device network. DEDGCN has stable MAP values on the different graph types, which directly shows that DEDGCN can be applied to most graph data and has better robustness.
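To make the MAP metric concrete, the sketch below computes average precision for a ranked list of candidate links and averages it over several lists. It is only an illustration: the text does not state how candidate links are grouped, so grouping by query (for example, one list per source node) is an assumption, and the function names are hypothetical.

```python
def average_precision(ranked_labels):
    """Average precision for one list of 0/1 labels, ordered best score first."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def mean_average_precision(scores_per_query, labels_per_query):
    """Mean of per-query average precision.

    Assumption: each query is one group of candidate links (for example, all
    candidate links of one source node) with a predicted score and a 0/1 label.
    """
    aps = []
    for scores, labels in zip(scores_per_query, labels_per_query):
        # Rank candidates by descending predicted score.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        aps.append(average_precision([labels[i] for i in order]))
    return sum(aps) / len(aps) if aps else 0.0


# Two toy queries with three candidate links each.
scores = [[0.9, 0.8, 0.1], [0.2, 0.7, 0.4]]
labels = [[1, 0, 1], [0, 0, 1]]
map_value = mean_average_precision(scores, labels)
print(round(map_value, 4))
```

Because true links are usually a tiny fraction of all candidate node pairs, MAP values in link prediction tend to be small in absolute terms, which is why values are compared across methods rather than read in isolation.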

Anomaly Detection.
We use the global BGP routing path information (including the AS relationship to which BGP belongs) collected by Route Views to build a global dynamic AS relationship network with two-hour snapshots. Because the number of ASes worldwide is huge, we selected Myanmar's AS network as the research object to verify DEDGCN's ability in anomaly detection tasks.
First of all, we learn the evolution law of the normal AS network through DEDGCN. Then, the AS network to be detected is input into the trained model to obtain the node embedding and node evolution results of the AS network. Finally, the similarity of the node embedding and node evolution results is calculated to determine when the network anomaly occurs.
Here, we use the cosine similarity method to calculate the similarity between each node's embedding and evolution result.
This similarity is also called the normal value in anomaly detection. To determine whether an anomaly occurs at time t, we take the average similarity of all nodes in the AS network at time t as the normal value of the network at time t. During continuous monitoring, when the network is abnormal, the normal value of the network shows a "steep drop." From this drop, we can determine the time at which the AS network became abnormal.
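The scoring procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: node embeddings and node evolution results are taken to be equal-length vectors per node, and the rule for flagging a "steep drop" (a decrease between consecutive snapshots larger than the hypothetical parameter `drop_threshold`) is an assumption, since the text does not define the drop numerically.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def network_normal_value(embeddings, evolved):
    """Average per-node similarity between embedding and evolution result at time t."""
    sims = [cosine_similarity(e, p) for e, p in zip(embeddings, evolved)]
    return sum(sims) / len(sims)


def steep_drops(normal_values, drop_threshold=0.3):
    """Indices t where the normal value falls by more than drop_threshold from t-1.

    Assumption: drop_threshold is a hypothetical tuning parameter.
    """
    return [
        t
        for t in range(1, len(normal_values))
        if normal_values[t - 1] - normal_values[t] > drop_threshold
    ]


# Toy snapshot with two nodes: one matches its evolution result, one is orthogonal.
value_t = network_normal_value([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 0.0]])
print(value_t)

# Toy series of network normal values over five snapshots.
series = [0.95, 0.94, 0.96, 0.40, 0.42]
print(steep_drops(series))
```

Comparing consecutive snapshots is only one possible rule; a rolling baseline over several earlier snapshots would be less sensitive to a single noisy value.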
Taking Myanmar as an example, we collected Myanmar's AS network from January to February 2021, divided it into snapshots in two-hour units, and used DEDGCN for monitoring. The monitoring results are shown in Figure 6.
It can be clearly seen from Figure 6 that, starting at 0:00 on February 1, 2021, the score of Myanmar's AS network experienced a "steep drop," indicating a problem with the Myanmar network at that time. Verification through Wikipedia shows that on February 1, 2021, the Myanmar military took over the government. To complete social control and control speech, it cut off the Internet and communications in major cities, resulting in large-scale network outages across the country. This also verifies DEDGCN's ability in anomaly detection tasks.

Conclusions
Through the performance on node classification, edge classification, link prediction, and anomaly detection tasks, the effect of our proposed method, DEDGCN, has been demonstrated. Based on the experimental results, we discussed the applicable task scenarios of GCN-RNN, which is based on node evolution, and EvolveGCN, which is based on parameter evolution. DEDGCN combines the advantages of node evolution and parameter evolution: it makes the extracted node features dynamic while retaining the inherent stability characteristics of the nodes. It has achieved good results in various tasks, which shows the broad applicability of the DEDGCN method. In addition, DEDGCN is an unsupervised graph representation model. We use unsupervised clustering methods to determine the type of nodes in the node classification task for experimental data, and the effect is only 1% lower than that of supervised classification. In the network anomaly detection task for realistic graph data, DEDGCN can perceive the time when the network anomaly occurs, which also shows the effectiveness of DEDGCN in unsupervised tasks.

Figure 6: Monitoring of the AS network in Myanmar.