Health monitoring and fault diagnosis of high-speed trains is an important research area for guaranteeing the safe, long-term operation of high-speed railways. For a multichannel health monitoring system, a major technical challenge is extracting information from channels whose patterns diverge because of the distinct types and layouts of the sensors. To this end, this paper proposes a novel group convolutional network based on synchrony information. The proposed method gathers signals with similar patterns and processes them with dedicated groups of neurons, while assigning signals that differ significantly to different groups. In this way, features can be extracted more effectively and performance improves, owing to the sharing of filters among similar patterns. The effectiveness of the method is validated on a high-speed train fault dataset. Experiments show that the proposed model outperforms normal convolutions and normal group convolutions on this task, achieving an accuracy of 98.27% (
As a rapidly developing mode of modern transportation, the high-speed train has the advantages of being fast, efficient, and environmentally friendly. With the high speed, high density, and cross-regional development of high-speed trains, their safe operation and maintenance has become the primary concern of the field [
The bogie units of the high-speed train.
Condition monitoring and fault diagnosis of critical components are urgently needed for high-speed trains. However, the number of sensor channels on the train body for vibration monitoring can be considerable. There are also couplings between sensor signals due to wheel-rail contact force, friction, and interaction between suspension systems [
Deep neural networks, as efficient machine learning models, have achieved great success in many fields such as computer vision, natural language processing, and autonomous driving [
Correlation analysis (also known as synchrony analysis in signal processing) is a method to evaluate the relationship between two or more variables. It can be used to explore potential similarities between signal patterns. There have been some efforts to apply correlation analysis to the analysis of monitoring signals. Liu et al. [
This paper proposes a fault diagnosis structure for multichannel monitoring systems to address the issues caused by the divergences and convergences of signal patterns across sensors. We combine group convolutions with synchrony information to improve the ability to process multichannel signals with group disparities. The proposed method collects signals with similar patterns and processes them with dedicated groups of neurons, which extracts features more effectively. Conversely, signals with significant differences are assigned to different groups of neurons instead of sharing the same neurons, which makes the model much easier to optimize. The contributions of this paper are as follows:

1. Three synchrony measurements (instantaneous phase synchrony, amplitude envelope synchrony, and composite synchrony) are introduced to estimate the similarity between vibration signals, providing a scalable and flexible way to measure the synchronization of multichannel signals.
2. A synchrony group convolutional network is proposed for signal pattern analysis and feature extraction in the multichannel monitoring system. The proposed structure can process multichannel signals with strong coupling and complex group disparities.
3. The proposed method is applied to fault diagnosis of a high-speed train bogie. The results show that our scheme achieves high fault classification accuracy while reducing the model size and computational burden, which provides a feasible and practical structure for fault diagnosis of the multichannel monitoring system.
The remainder of this paper is organized as follows: Section
In this section, the principles of synchrony analysis and hierarchical clustering for similarity-based grouping of sensor channels are presented. Furthermore, group convolutions, a variant of the convolution layer in neural networks, are introduced.
Signal synchrony [
Let
In order to capture the synchrony between signals, the Pearson correlation coefficient [
The Pearson correlation coefficient is a global synchrony measurement that reduces the relationship between two signals to a single value. However, it is equally essential to analyse local synchrony. One way to obtain a time-resolved Pearson correlation is to calculate the coefficient within a sliding window, which partitions the time-domain input signal into several disjoint or overlapping blocks by multiplying the signal with a window function until the entire signal is covered. Local synchrony can also be measured on the instantaneous phase. Instantaneous phase synchrony analysis is a method for deriving time-resolved connectivity analysis of signals, which has been applied in functional connectivity analysis [
Assuming that signals satisfy Bedrosian’s theorem [
The sinusoid accounts for phase wrapping and ambiguity in the sign of phases over time. Calculation of equation (
In this paper, the global-level amplitude envelope synchrony and instantaneous phase synchrony of each channel pair are quantified by the Pearson correlation coefficient and the average phase coherence over all time steps, respectively. In addition, the composite synchrony among channels is calculated as the weighted average of the amplitude envelope synchrony and the instantaneous phase synchrony. This averaging approach and its relationship with hierarchical clustering are explained in the following section.
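The three measures can be illustrated with a minimal sketch. The function names, the naive DFT used to build the analytic signal, and the default weight of 0.5 are illustrative assumptions. The local phase measure `1 - |sin(phase difference)|` is assumed from the sinusoid-based description above and is simply averaged over time here as a stand-in for the average phase coherence, whose exact equation is not reproduced in this text.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence via a naive O(n^2) DFT (short demos only)."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0  # DC and Nyquist bins are kept once
        elif k < (n + 1) // 2:
            gain = 2.0  # positive frequencies are doubled
        else:
            gain = 0.0  # negative frequencies are removed
        spectrum[k] *= gain
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def amplitude_envelope_synchrony(x, y):
    """Pearson correlation of the two amplitude envelopes."""
    env_x = [abs(z) for z in analytic_signal(x)]
    env_y = [abs(z) for z in analytic_signal(y)]
    return pearson(env_x, env_y)


def phase_synchrony(x, y):
    """Time average of the assumed local measure 1 - |sin(phase difference)|."""
    zx, zy = analytic_signal(x), analytic_signal(y)
    local = [
        1.0 - abs(math.sin(cmath.phase(a) - cmath.phase(b)))
        for a, b in zip(zx, zy)
    ]
    return sum(local) / len(local)


def composite_synchrony(x, y, weight=0.5):
    """Weighted average of envelope and phase synchrony (weight is hypothetical)."""
    return (weight * amplitude_envelope_synchrony(x, y)
            + (1.0 - weight) * phase_synchrony(x, y))


# Synthetic test signals (not data from the paper).
n = 128
carrier = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
quadrature = [math.cos(2 * math.pi * 8 * t / n) for t in range(n)]
modulated = [(1.0 + 0.5 * math.sin(2 * math.pi * 2 * t / n)) * c
             for t, c in enumerate(carrier)]

quarter_cycle_sync = phase_synchrony(carrier, quadrature)
self_sync = composite_synchrony(modulated, modulated)
```

Under this assumed measure, two sinusoids a quarter cycle apart score close to zero phase synchrony, while a signal compared with itself scores one on both components.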
The hierarchical clustering [
Hierarchical clustering.
The basic scheme of agglomerative hierarchical clustering is as follows:

1. Cluster initialization: each data point is considered an individual cluster.
2. Cluster distance (similarity) calculation: all pairwise distances are calculated according to the similarity, which results in a distance matrix.
3. Merge the two closest clusters.
4. Update the distance matrix to reflect the pairwise distances between the new cluster and the original clusters.
5. Repeat Steps 3 and 4 until only a single cluster remains.
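The steps above can be sketched in a few lines. This is an illustrative implementation, not the authors' code: it stops merging once a requested number of clusters remains, and instead of updating a stored distance matrix it recomputes each cluster distance from the original pairwise distances, which gives the same result for single, complete, and average linkage. The synchrony values and the conversion `distance = 1 - synchrony` are assumptions made for the example.

```python
def agglomerative_clusters(distance, n_clusters, linkage="average"):
    """Merge the two closest clusters until n_clusters remain.

    distance: symmetric matrix (list of lists) of pairwise distances.
    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    reducers = {
        "single": min,
        "complete": max,
        "average": lambda values: sum(values) / len(values),
    }
    reduce_fn = reducers[linkage]

    # Step 1: every point starts as its own cluster.
    clusters = [[i] for i in range(len(distance))]

    def cluster_distance(a, b):
        # Linkage computed directly from the original pairwise distances.
        return reduce_fn([distance[i][j] for i in a for j in b])

    while len(clusters) > n_clusters:
        # Steps 2 and 3: find and merge the closest pair of clusters.
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = cluster_distance(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        merged = sorted(clusters[p] + clusters[q])
        # Step 4: the merged cluster replaces its two parents.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (p, q)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical pairwise synchrony of four channels (values are made up).
synchrony = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
# Assumed conversion: highly synchronous channels are close together.
distance = [[1.0 - s for s in row] for row in synchrony]
groups = agglomerative_clusters(distance, n_clusters=2)
```

With these made-up values, channels 0 and 1 end up in one group and channels 2 and 3 in the other.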
Agglomerative hierarchical clustering relies on constructing a distance matrix between all of the clusters. A critical operation is determining the distance between two clusters, for which various approaches exist. The most common ones include single linkage, complete linkage, and average linkage [
There are two main advantages of hierarchical clustering for our purpose. Firstly, clusters can be obtained from a precomputed pairwise distance matrix, which is compatible with the pairwise synchrony measure and makes it easy to handle any type of similarity or distance. Secondly, hierarchical clustering offers flexibility in the level of granularity. A sensor channel may belong to different clusters under different viewpoints. For example, vibration sensors on the vehicle can be divided into displacement sensors and accelerometers based on measured values. Alternatively, these sensors can also be grouped by the measured directions, namely, lateral direction, vertical direction, and longitudinal direction. The desired hierarchy and number of clusters can be obtained by “cutting” the dendrogram at the proper level, which makes it convenient to adjust the number of clusters according to requirements. Thus, agglomerative hierarchical clustering can be applied to obtain the groups of sensor channels from signal synchrony.
Group convolutions are a variant of the convolution layer in neural networks in which the channels of the input feature map are grouped and the convolution operation is performed independently for each channel group. Usually, convolution filters are applied to an input layer by layer to get the final output feature maps. Instead of applying all filters to all channels, group convolutions use different sets of convolution filter groups on the same input so that there is more than one pathway for convolutions on a single input (Figure
Normal convolution and group convolution.
The main advantages of group convolution are as follows [

1. Reducing model size: Group convolutions can decrease the number of model parameters through filter grouping. Moreover, for a convolution layer with
2. Efficient training process: With the group approach, the convolution operations are divided into multiple independent parts that can be handled by different processors in parallel. The parallel execution can reduce the training time and improve the scalability of the model.
3. Better feature representation: Group convolutions can also provide a better model than normal convolutions. Through the filter group structure, feature maps are forced into a dense block diagonal structure, and filters with strong mutual information are grouped adjacent to each other.
In this paper, instead of dividing channels into equal groups sequentially, the groups of sensor channels are inferred based on the signal correlation from amplitude envelope synchrony and phase synchrony.
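A small sketch may help make the idea of unequal, non-sequential channel groups concrete. It implements a plain "valid" 1-D group convolution (computed as cross-correlation, as is usual in neural networks) in which each group is given as an explicit list of channel indices. The function name, the data layout, and the toy numbers are all assumptions made for illustration.

```python
def grouped_conv1d(x, groups, filters):
    """Valid 1-D cross-correlation with channel groups of arbitrary size.

    x:       list of input channels, each a list of samples of equal length.
    groups:  list of lists of channel indices, one list per group.
    filters: filters[g] is a list of filters for group g; each filter holds
             one list of taps per channel of that group.
    Returns the output feature maps, ordered group by group.
    """
    outputs = []
    for channel_ids, group_filters in zip(groups, filters):
        for taps_per_channel in group_filters:
            width = len(taps_per_channel[0])
            length = len(x[0]) - width + 1
            # Each filter only sees the channels that belong to its own group.
            feature_map = [
                sum(
                    taps[k] * x[c][t + k]
                    for c, taps in zip(channel_ids, taps_per_channel)
                    for k in range(width)
                )
                for t in range(length)
            ]
            outputs.append(feature_map)
    return outputs


# Toy input: three channels with four samples each (made-up values).
x = [
    [1, 2, 3, 4],
    [10, 20, 30, 40],
    [0, 1, 0, 1],
]
# Channels 0 and 2 form one group, channel 1 forms another.
groups = [[0, 2], [1]]
filters = [
    [[[1, 1], [1, 1]]],  # group 0: one filter, two taps on each of its two channels
    [[[1, -1]]],         # group 1: one filter, a first difference on its single channel
]
feature_maps = grouped_conv1d(x, groups, filters)
# feature_maps == [[4, 6, 8], [-10, -10, -10]]
```

The first feature map depends only on channels 0 and 2, and the second only on channel 1, so no filter is shared between the two groups.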
This paper focuses on improving the ability of the neural network to process multichannel signals with group disparities. Because monitoring sensors differ in type and location, there are divergences and convergences among the signals of different channels. The diversity of channel signals grows as the number of channels increases, which calls for a more effective neural network model for feature extraction.
To address this issue, we propose synchrony group convolutions to construct a fault diagnosis scheme for a high-speed train bogie. We combine group convolutions with synchrony information to improve the ability to process multichannel signals with group disparities. A complete description of the proposed structure for fault diagnosis is schematically represented in Figure

1. Synchrony calculation: the pairwise phase synchrony and amplitude envelope synchrony between sensor channels are calculated based on analytic components and correlation measurement. The composite synchrony between signals is obtained through a weighted average of these two correlation metrics derived from the Pearson correlation coefficient and the average phase coherence, respectively.
2. Channel clustering based on synchrony: after obtaining the synchrony information between channels, we use the pairwise correlations to construct a correlation matrix. Then, agglomerative hierarchical clustering is employed to cluster channels with strong correlations. Note that the number of groups should be chosen by taking into account not only the clustering results but also the complexity of the neural network.
3. Synchrony group convolutional network training: in the fault diagnosis model, the channel groups in the group convolution layers are based on the results of hierarchical clustering. Synchrony group convolutions are configured in the forepart of the network. The filters in each group remain independent in these layers. More specifically, the output feature maps of a given group stay in one group in the next convolution layer, which allows much deeper features to be extracted for a channel group with synchronous signals. After all synchrony group convolution layers, standard convolution layers are attached for feature fusion.
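To give a feel for how keeping groups separate affects model size, the sketch below counts the weights and biases of 1-D convolution layers with and without grouping. The group sizes, filter counts, and kernel size are hypothetical (the layer configuration is not given in this section); only the total of 58 input channels is taken from the dataset described later.

```python
def conv1d_params(in_channels, out_channels, kernel_size, bias=True):
    """Number of trainable parameters of a standard 1-D convolution layer."""
    return out_channels * (in_channels * kernel_size + (1 if bias else 0))


def grouped_conv1d_params(group_in, group_out, kernel_size, bias=True):
    """Parameters of a group convolution whose groups may have unequal sizes.

    group_in[g] and group_out[g] are the input and output channel counts of group g.
    """
    return sum(
        conv1d_params(i, o, kernel_size, bias)
        for i, o in zip(group_in, group_out)
    )


# Hypothetical clustering of the 58 sensor channels into four unequal groups.
group_in = [20, 18, 12, 8]
layer1_out = [16, 16, 16, 16]  # assumed filters per group in the first layer
layer2_out = [32, 32, 32, 32]  # assumed filters per group in the second layer
kernel_size = 3                # assumed kernel width

# Layer 1: each group only convolves its own sensor channels.
grouped_layer1 = grouped_conv1d_params(group_in, layer1_out, kernel_size)
normal_layer1 = conv1d_params(sum(group_in), sum(layer1_out), kernel_size)

# Layer 2: the feature maps of a group stay inside that group.
grouped_layer2 = grouped_conv1d_params(layer1_out, layer2_out, kernel_size)
normal_layer2 = conv1d_params(sum(layer1_out), sum(layer2_out), kernel_size)

# grouped_layer1 == 2848,  normal_layer1 == 11200
# grouped_layer2 == 6272,  normal_layer2 == 24704
```

How the output filters are distributed over unequal groups is a design choice; a different allocation than the equal split assumed here would change these counts.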
The proposed framework based on synchrony group convolutions for fault diagnosis of the high-speed train.
The proposed structure is an end-to-end neural network in which all synchrony group convolution layers can be freely adjusted. In addition, the proposed synchrony group convolutions are compatible with existing deep neural network structures. Synchrony group convolution layers can be plugged into the forepart of established structures, which improves the capability of the neural network model to handle multisensor signals. The main advantages of the proposed scheme are summarized as follows:

1. Correlated pattern assembly: The synchrony group convolutions can discover and extract the latent features of signals with similar patterns, which improves the performance and capacity of the neural network for multichannel signals.
2. Scalable and adjustable: The synchrony group convolution layers can be configured according to the actual requirements of the signal channels. The depth and width of these layers are fully controllable. The number of groups can also be adjusted by choosing a proper level of the hierarchy.
3. Fully learnable structure: The entirely data-driven scheme reduces the requirement for prior knowledge and needs little human intervention, which also reduces the biases caused by limited domain knowledge.
The data for experiments is from the simulation platform of the high-speed train bogie, which is developed by the State Key Laboratory of Traction Power at Southwest Jiaotong University [
The monitoring sensors measure the motion characteristics, including lateral, longitudinal, and vertical vibration accelerations of different parts (Figure
The structure of the high-speed train bogie and the location of sensors.
Channel descriptions of monitoring sensors on the vehicle.
Channel number | Description |
---|---|
1 | Lateral acceleration of the vehicle front part |
2 | Lateral acceleration of the vehicle rear part |
3 | Lateral acceleration of the vehicle middle part |
4 | Vertical acceleration of the vehicle middle part |
5 | Vertical acceleration of the vehicle front part |
6 | Vertical acceleration of the vehicle rear part |
7 | Lateral acceleration of the bogie 1 in position 1 |
8 | Vertical acceleration of the bogie 1 in position 1 |
9 | Lateral acceleration of the bogie 1 in position 4 |
10 | Vertical acceleration of the bogie 1 in position 4 |
11 | Lateral acceleration of the bogie 1 in the middle |
12 | Vertical acceleration of the bogie 1 in the middle |
13 | Lateral acceleration of the bogie 2 in position 5 |
14 | Vertical acceleration of the bogie 2 in position 5 |
15 | Lateral acceleration of the bogie 2 in position 8 |
16 | Vertical acceleration of the bogie 2 in position 8 |
17 | Lateral acceleration of the bogie 2 in the middle |
18 | Vertical acceleration of the bogie 2 in the middle |
19 | Longitudinal acceleration of the axle box 1 |
20 | Lateral acceleration of the axle box 1 |
21 | Vertical acceleration of the axle box 1 |
22 | Longitudinal acceleration of the axle box 2 |
23 | Lateral acceleration of the axle box 2 |
24 | Vertical acceleration of the axle box 2 |
25 | Longitudinal acceleration of the axle box 3 |
26 | Lateral acceleration of the axle box 3 |
27 | Vertical acceleration of the axle box 3 |
28 | Longitudinal acceleration of the axle box 4 |
29 | Lateral acceleration of the axle box 4 |
30 | Vertical acceleration of the axle box 4 |
31 | Longitudinal displacement of the vehicle front part |
32 | Vertical displacement of the vehicle front part |
33 | Longitudinal displacement of the vehicle middle part |
34 | Vertical displacement of the vehicle middle part |
35 | Longitudinal displacement of the vehicle rear part |
36 | Vertical displacement of the vehicle rear part |
37 | Lateral displacement of the bogie 1 in position 1 |
38 | Vertical displacement of the bogie 1 in position 1 |
39 | Lateral displacement of the bogie 1 in position 4 |
40 | Vertical displacement of the bogie 1 in position 4 |
41 | Lateral displacement of the bogie 1 in the middle |
42 | Vertical displacement of the bogie 1 in the middle |
43 | Lateral displacement of the bogie 2 in position 5 |
44 | Vertical displacement of the bogie 2 in position 5 |
45 | Lateral displacement of the bogie 2 in position 8 |
46 | Vertical displacement of the bogie 2 in position 8 |
47 | Lateral displacement of the bogie 2 in the middle |
48 | Vertical displacement of the bogie 2 in the middle |
49 | Lateral displacement of the wheelset 1 |
50 | Lateral displacement of the wheelset 2 |
51 | Lateral displacement of the wheelset 3 |
52 | Lateral displacement of the wheelset 4 |
53 | Relative displacement of the primary suspension in position 1 |
54 | Relative displacement of the primary suspension in position 8 |
55 | Relative displacement of the secondary suspension in position 1 |
56 | Relative displacement of the secondary suspension in position 4 |
57 | Relative displacement of the yaw damper in position 1 |
58 | Relative displacement of the yaw damper in position 8 |
In order to explore the correlation between channels, the pairwise synchrony for all combinations of the 58 sensor channels is calculated for both the instantaneous phase and the amplitude envelope. Here, two examples of the channel synchrony calculation are presented, covering local synchrony and global synchrony in the instantaneous phase and the amplitude envelope. The results of synchrony analysis are illustrated in Figure
Synchrony analysis based on instantaneous phase and amplitude envelope. (a) The lateral acceleration of the bogie 1 in the middle (channel 11) and the lateral acceleration of the axle box 1 (channel 20). (b) The lateral acceleration (channel 11) and the vertical acceleration (channel 12) of the bogie 1 in the middle.
The first example is channel 11 and channel 20 which is shown in Figure
By calculating the phase synchrony and amplitude envelope synchrony of all combinations of the 58 sensor channels, the correlation matrix is obtained. From the correlation matrix, clusters of sensor channels can be obtained through hierarchical clustering, which is shown in Figure
Hierarchically clustered synchrony maps for sensor channels. Hierarchical cluster map based on (a) the amplitude envelope synchrony, (b) the instantaneous phase synchrony, and (c) the composite synchrony.
Our experiments compare the proposed synchrony group convolutions with the normal convolution and normal group convolutions (GC) to validate the performance of the proposed scheme. We also compare the performance of synchrony group convolutions based on three measurements, namely, amplitude envelope synchrony (ASGC), instantaneous phase synchrony (PSGC), and composite synchrony (CSGC).
As mentioned before, the monitoring data come from multibody dynamics simulations excited by the track spectrum. The data samples for the experiments are extracted with a sliding window that is 243 points wide. For the training set, the window is slid by 20 points. For the validation and test sets, the window is slid by 243 points, which equals the window length, so the windows do not overlap. There are 17980 samples in total, comprising 3100 training samples, 7440 validation samples, and 7440 testing samples, and each class at each speed has an equal number of samples.
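The segmentation described above can be sketched as follows. The window width of 243 points and the strides of 20 and 243 points come from the text; the record length of 1000 points and the function name are hypothetical and only serve to make the example runnable.

```python
def sliding_windows(signal, width, stride):
    """Cut a sequence into windows of `width` samples taken every `stride` samples.

    Trailing samples that do not fill a complete window are dropped.
    """
    return [
        signal[start:start + width]
        for start in range(0, len(signal) - width + 1, stride)
    ]


WINDOW = 243                 # window width stated in the text
record = list(range(1000))   # hypothetical single-channel record

# Overlapping windows for training (stride of 20 points).
train_windows = sliding_windows(record, WINDOW, stride=20)
# Non-overlapping windows for validation and testing (stride equals the width).
eval_windows = sliding_windows(record, WINDOW, stride=WINDOW)

# len(train_windows) == 38, len(eval_windows) == 4
```

The overlapping stride yields many more training windows from the same record, while the stride equal to the window width keeps evaluation windows disjoint.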
In addition, for proposed synchrony group convolutions and normal group convolutions, we compare the performance under 4 groups, 8 groups, and 12 groups, respectively, to test the model performance under various conditions. For the sake of comparison, only synchrony group convolution layers are replaced with normal convolution layers in corresponding comparative experiments, the structure and parameters being maintained. We conduct ten trials with random split training sets to obtain the average accuracy and standard deviation. The experiment results are reported in Table
The experiment results of bogie fault classification.
ID | Method | Model size (parameters) | Accuracy (%) | Precision (%) | Recall (%) | F1 score (%) |
---|---|---|---|---|---|---|
1 | Normal convolution | 1362391 | 73.88 ( | 71.98 ( | 73.88 ( | 72.10 ( |
2 | GC-4 | 596501 | 87.87 ( | 87.78 ( | 87.87 ( | 87.69 ( |
3 | GC-8 | 377290 | 69.60 ( | 68.34 ( | 69.60 ( | 68.46 ( |
4 | GC-12 | 308632 | 65.26 ( | 63.20 ( | 65.26 ( | 63.53 ( |
5 | ASGC-4 | 502135 | 97.79 ( | 97.82 ( | 97.79 ( | 97.79 ( |
6 | ASGC-8 | 365719 | 91.52 ( | 91.57 ( | 91.52 ( | 91.45 ( |
7 | ASGC-12 | | | | | |
8 | PSGC-4 | 613495 | 68.22 ( | 65.56 ( | 68.22 ( | 65.95 ( |
9 | PSGC-8 | | | | | |
10 | PSGC-12 | 347623 | 91.08 ( | 91.15 ( | 91.08 ( | 90.98 ( |
11 | CSGC-4 | | | | | |
12 | CSGC-8 | 364327 | 92.67 ( | 92.74 ( | 92.67 ( | 92.62 ( |
13 | CSGC-12 | 335095 | 92.78 ( | 92.86 ( | 92.78 ( | 92.73 ( |
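
As a reading aid for the four metric columns, the sketch below computes accuracy together with class-averaged precision, recall, and F1 score. Macro averaging is an assumption, since the averaging scheme is not stated here, and the labels are made up. With an equal number of samples per class, class-averaged recall coincides with accuracy, which is consistent with the identical accuracy and recall values reported in the table.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy and macro-averaged precision, recall and F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    n = len(labels)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1_scores) / n


# Made-up predictions for three balanced classes (four samples each).
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 0, 0]
accuracy, precision, recall, f1 = macro_metrics(y_true, y_pred)
# accuracy and recall are both 0.75; precision is 0.8
```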
Training curves for comparative methods. Curves are averages over ten trials, and error bands give standard deviations.
It can be seen from the results in Table
It is worth mentioning that each synchrony measurement has its own merits (experiments 7, 9, and 11) when the number of groups is specified according to the hierarchical clustering. The synchrony group convolution based on composite synchrony is a compromise between phase synchrony and amplitude envelope synchrony, and it achieves the best performance among all experiments when the number of groups equals 4.
The training curves for synchrony group convolutions are more stable in the initial few epochs and increase faster, which suggests that synchrony group convolutions are much easier to optimize. Besides, owing to the unequal group sizes, the model size of synchrony group convolutions differs from that of normal group convolutions with the same number of groups: among the entries available in the results table, it is smaller for ASGC-4, ASGC-8, and CSGC-8 but larger for PSGC-4, PSGC-12, and CSGC-12.
This paper proposes synchrony group convolutions for multichannel monitoring systems. The proposed method gathers signals with similar patterns and processes them with dedicated groups of neurons, which extracts features more effectively. Signals with significant differences are processed by different groups of neurons instead of sharing the same neurons, which makes the model much easier to optimize. Experimental results show that the proposed model performs much better than normal convolutions and normal group convolutions on our task, achieving an accuracy of 98.27% (
As an initial work on synchrony group convolutions, the proposed approach has achieved promising performance. One possible extension of this work is the joint optimization of the number of groups and the network structure to achieve better performance. Another challenge is interpretable feature clustering and fusion for multichannel fault diagnosis systems, which is essential for the evaluation of reliability and uncertainty. The authors will investigate this topic in future research.
The data used to support the findings of this study are available from the corresponding author upon request.
The authors declare that they have no conflicts of interest.