Preliminary Study on the Efficient Electrohysterogram Segments for Recognizing Uterine Contractions with Convolutional Neural Networks

Background. Uterine contraction (UC) is the tightening and shortening of the uterine muscles, which can indicate the progress of pregnancy towards delivery. The electrohysterogram (EHG), which reflects uterine electrical activity, has recently been studied for UC monitoring. In this paper, we aimed to evaluate different EHG segments for recognizing UCs using a convolutional neural network (CNN). Materials and Methods. From the open-access Icelandic 16-electrode EHG database (122 recordings from 45 pregnant women), 7136 UC and 7136 non-UC EHG segments of 60 s duration were manually extracted from 107 recordings of 40 pregnant women to develop a CNN model. A fivefold cross-validation was applied to evaluate the CNN based on sensitivity (SE), specificity (SP), and accuracy (ACC). Then, 1056 UC and 1056 non-UC EHG segments were extracted from the other 15 recordings of 5 pregnant women, and the developed CNN model was applied to identify UCs using EHG segments with durations of 10 s, 20 s, and 30 s. Results. The CNN achieved an average SE, SP, and ACC of 0.82, 0.93, and 0.88 for 60 s EHG segments. The EHG segments of 10 s, 20 s, and 30 s around the TOCO peak achieved higher SE and ACC than the other segments of the same duration. The SE values from 20 s EHG segments around the TOCO peak were higher than those from 10 s and 30 s EHG segments on the same side of the TOCO peak. Conclusion. The proposed method could be used to determine the efficient EHG segments for recognizing UC with the CNN.


Introduction
Uterine contraction (UC) is the tightening and shortening of the uterine muscles. UC can reflect the progress of pregnancy towards delivery and is a major observation for estimating the approach of delivery [1]. Electrohysterogram (EHG), which reflects uterine electrical activities, is a promising noninvasive technology for external UC monitoring [2]. However, it is still ambiguous which EHG segments are appropriate for recognizing UC.
Currently, four methods have been proposed to assess UCs. Manual palpation, which identifies UC by palpating the parturient abdomen over the uterine corpus, requires the constant bedside presence of a trained operator [1]. The intrauterine pressure catheter (IUPC) is limited by its invasiveness and the need for ruptured membranes [3]. External tocodynamometry (TOCO) is noninvasive, but its recording quality depends on the correct position of the sensor on the maternal abdomen and is influenced by maternal movements and the amount of subcutaneous fat [4]. Recently, EHG measurement has been considered a noninvasive alternative to TOCO for monitoring UC [5].
EHG features have been investigated to distinguish between UCs and non-UCs (non-uterine contractions) [6,7]. These features have been extracted by power spectral density, wavelet packet decomposition [8], autoregression model [9], and other signal processing methods in the time and the frequency domain [1,7]. Nonlinear processes are also involved in generating UCs because of the complex interactions between billions of myometrium cells [10]. Therefore, nonlinear methods including time reversibility, sample entropy, Lyapunov exponents and delay vector variance [11], nonlinear interdependencies [10], and multifractal analysis [12,13] are useful for EHG analysis. Some advanced algorithms including the Hilbert transform, cross-correlation [14], correlation coefficient H² [5], mutual correlation dimension, cross-approximate entropy [15], and dynamic cumulative sum [16] have also been proposed for UC detection. Besides, classifiers including the support vector machine [17], random forest, and artificial neural network [7] have been developed for automatic UC detection using TOCO, cardiotocogram [18], and EHG signals. Even though some convincing results have been reported, there were discrepancies between them because of the different data sources, feature selection algorithms, and classifiers applied [19,20].
The convolutional neural network (CNN) has recently been applied to obstetrics and gynecology [21] for classification of the fetal heart rate [22], electromyography [23], and electrocardiogram signals [24]. The CNN is a type of machine learning model which can classify images and time series without additional feature extraction and selection and produce state-of-the-art recognition results. The outstanding classification capability of the CNN provides possibilities for detecting UCs with EHG images.
The purpose of this study is to investigate the EHG segments appropriate for identifying UC. A CNN will be developed with EHG segments of 60 s and then utilized to evaluate EHG segments of 10 s, 20 s, and 30 s relative to the TOCO peak.

Materials and Methods
EHG signals were first manually segmented into UCs and non-UCs based on UC annotations and TOCO signals. 7136 UCs and 7136 non-UCs of 60 s duration were extracted from 107 recordings of 40 pregnant women and used to establish a CNN model. Then, 1056 UCs and 1056 non-UCs were extracted from the other 15 recordings of 5 pregnant women. In particular, the EHG segments of 10 s, 20 s, and 30 s were classified as UC and non-UC using the established CNN model. The EHG segments of different durations were evaluated based on their sensitivity (SE), specificity (SP), and accuracy (ACC). In this study, a UC was divided into several small segments, and those with higher SE and ACC were considered efficient EHG segments for recognizing UC. The details of each step are shown in Figure 1.

Icelandic 16-Electrode EHG Database.
The open-access Icelandic 16-electrode EHG database contains 122 EHG recordings performed on 45 pregnant women, some of whom were recorded more than once, at Akureyri Primary Health Care Centre and Landspitali University Hospital in Iceland between 2008 and 2010 [25]. The database also provides simultaneously recorded tocographs, annotations of events, and obstetric information of participants. The participants had normal singleton pregnancies without any known preterm birth risk factors. A 4 × 4 grid of electrodes was placed on the abdomen with the reference and ground electrodes on each side of the body (not standardized), as shown in Figure 2(a). Recordings were performed in the third trimester (112 recordings) and during labor (10 recordings). The average recording durations for pregnancy and labor were 61 and 36 min, respectively. The EHG signals were sampled at 200 Hz.

EHG Signal Preprocessing and Segmentation.
EHG signals were downsampled to 20 Hz and preprocessed with a 5th-order Butterworth bandpass digital filter (0.1 to 4 Hz) to remove unwanted interference [20,26].
Each EHG signal was manually divided into UC and non-UC segments based on the UC annotation and TOCO signal [5,27]. The UC segment was placed symmetrically around the TOCO peak for easy identification [4,28]. The corresponding non-UC segment was extracted between two UCs, as shown in Figure 2(b). In total, 7136 UCs and 7136 non-UCs of 60 s duration were extracted and confirmed by two clinicians. A segment was discarded if either clinician disagreed. Then, EHG segments with durations of 10 s, 20 s, and 30 s were extracted from the left and right sides of the TOCO peak, as shown in Figure 3. Considering the time difference between the EHG recordings, annotations, and tocographs [28,29], twelve 10-second EHG segments (10_L1∼6 and 10_R1∼6), six 20-second EHG segments (20_L1∼3 and 20_R1∼3), and four 30-second EHG segments (30_L1∼2 and 30_R1∼2), each set spanning a total of 120 s, were extracted to cover as much of each UC as possible. 1056 UCs and 1056 non-UCs with different durations were extracted.
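The window layout relative to the TOCO peak can be illustrated with a short sketch. It assumes that index 1 denotes the window adjacent to the peak and that higher indices move away from it, which is consistent with 10_L1 and 10_R1 being described as the segments around the TOCO peak; the function names `segment_windows` and `to_samples` are hypothetical.

```python
FS = 20  # Hz, sampling rate after downsampling


def segment_windows(peak_s, duration_s, half_span_s=60):
    """Return {label: (start_s, end_s)} for windows tiling both sides of a peak.

    Assumption: window k on the left covers [peak - k*d, peak - (k-1)*d]
    and window k on the right covers [peak + (k-1)*d, peak + k*d].
    """
    if half_span_s % duration_s != 0:
        raise ValueError("duration must divide the span on each side")
    n_per_side = half_span_s // duration_s
    windows = {}
    for k in range(1, n_per_side + 1):
        windows[f"{duration_s}_L{k}"] = (peak_s - k * duration_s,
                                         peak_s - (k - 1) * duration_s)
        windows[f"{duration_s}_R{k}"] = (peak_s + (k - 1) * duration_s,
                                         peak_s + k * duration_s)
    return windows


def to_samples(window, fs=FS):
    """Convert a (start_s, end_s) window to integer sample indices."""
    start_s, end_s = window
    return int(round(start_s * fs)), int(round(end_s * fs))


# Example: a TOCO peak at t = 100 s gives twelve 10 s windows over 120 s.
ten_second = segment_windows(100, 10)
```

With a 60 s span on each side, this layout yields twelve 10 s, six 20 s, and four 30 s windows, matching the counts given in the text.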
Finally, all EHG segments were saved as images and normalized to 482 × 482 pixels by resizing. Sixteen EHG images were obtained from 16-channel recordings for each UC and non-UC.

Convolutional Neural Network for Classification of EHG Segments.
The CNN is a specialized deep neural network for processing 1D time series and 2D images [24]. In this study, the CNN consisted of convolutional (Conv), max-pooling, fully connected (FC), local response normalization (LRN), dropout, and softmax layers and a rectified linear unit (ReLU), as shown in Figure 4. The Conv layer, with the image size of length l and width w and the number of filters (m) denoted by l × w@m, was used to extract features of the input image. The max-pooling layer downsampled the feature map and reduced the computational complexity. The number of neurons in the FC layer was denoted by num_output.
Every Conv and FC layer was followed by a ReLU activation [24,30], which speeds up the training process. Following a ReLU, the LRN layer detected high-frequency features and assigned them large weights [31]. The parameters in the LRN layer were set as follows: local_size of 5, α of 0.0001, and β of 0.75 [32]. The dropout layer, which dropped half of the connections, could reduce overfitting and improve regularization [30]. The batch gradient descent algorithm was applied to help the CNN converge toward the global optimum. Finally, the FC layer was connected to the softmax function (loss, shown in Table 1) to obtain the final output [22].
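To make the LRN parameters concrete, the sketch below normalizes the activations at a single pixel across neighbouring channels. It assumes the across-channel form used by default in Caffe, in which each activation is divided by (k + (α/n) Σ a²)^β with n = local_size and k = 1; the paper does not state the normalization mode or k, and the function name is hypothetical.

```python
def lrn_across_channels(values, local_size=5, alpha=1e-4, beta=0.75, k=1.0):
    """Local response normalization at one pixel.

    `values` holds one activation per channel. Assumptions: across-channel
    mode and k = 1.0; local_size, alpha, and beta follow the text.
    """
    half = local_size // 2
    normalized = []
    for i, activation in enumerate(values):
        # Neighbouring channels, clipped at the first and last channel.
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        squared_sum = sum(v * v for v in values[lo:hi])
        scale = (k + (alpha / local_size) * squared_sum) ** beta
        normalized.append(activation / scale)
    return normalized
```

With α this small, the scale stays close to 1 unless neighbouring activations are large, so the layer mainly damps strong responses relative to their neighbours.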
Stride refers to the number of pixels by which the filter shifts across the input image at each step. In the first layer, the size of the input image changed from 482 × 482@96 to 92 × 92@96 when the kernel was set to 27 and the stride was set to 5. Then, the image size decreased from 92 × 92@96 to 31 × 31@96 after the max-pooling layer. Subsequently, the image was processed with a stride of 1, a kernel of 2, and a max-pooling layer, which reduced its size from 30 × 30@256 to 15 × 15@256. After the third Conv layer, the image size was further reduced to 13 × 13@384. The Conv and max-pooling layers were then applied once again. These were followed by FC layers with 4096 neurons. The final FC layer consisted of two neurons to classify UC and non-UC. The detailed parameters, including the kernel, stride, weight, and bias initialization of the CNN, are listed in Table 1; they were chosen based on prior knowledge and manual tuning to achieve a satisfactory training result.
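The feature-map sizes quoted above follow from the usual output-size arithmetic, sketched below. The first convolution (kernel 27, stride 5) and the second convolution (kernel 2, stride 1) use values from the text; the pooling kernels and strides and the third convolution kernel are assumptions chosen to reproduce the reported sizes, since Table 1 is not reproduced here. The rounding-up rule for pooling is the convention used by Caffe.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Output width of a convolution (rounded down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride, pad=0):
    """Output width of a pooling layer (rounded up, as in Caffe)."""
    return math.ceil((size + 2 * pad - kernel) / stride) + 1


size = 482
size = conv_out(size, kernel=27, stride=5)   # 92, values from the text
size = pool_out(size, kernel=3, stride=3)    # 31, pooling values assumed
size = conv_out(size, kernel=2, stride=1)    # 30, values from the text
size = pool_out(size, kernel=2, stride=2)    # 15, pooling values assumed
size = conv_out(size, kernel=3, stride=1)    # 13, kernel assumed
```

Each step matches the sizes stated in the paragraph: 482 to 92, 92 to 31, 31 to 30, 30 to 15, and 15 to 13.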
Each training run of the CNN produced different results depending on the hyperparameter values [22]. The repetition of each experiment process is called an "iteration" [33]. The standard deviation was set to 0.1, and some small positive values (0, 1) were added to the bias to avoid dead nodes [22]. The dropout (dropout_ratio) was set to 0.5 to gain the best results. Based on the results from the preliminary test, fine tuning was performed with a learning rate of 0.001, weight decay of 0.0005, learning rate drop factor of 0.1, learning rate drop period of 10, momentum of 0.9, gamma of 0.1, and maximum iteration of 20000.
The hyperparameters for training the CNN are shown in Table 2.
The CNN was run on a workstation with the Linux Ubuntu 18.04 LTS operating system and an NVIDIA 1080 Ti GPU. The development environment was the Caffe framework, and the development languages were MATLAB and Python.

Evaluation of CNN Model.
A fivefold cross-validation was utilized to evaluate the performance of the established CNN [34]. The 7136 UCs and 7136 non-UCs were divided equally into five subsets, four of which were used to train the CNN model while the remaining one was used to test it. This process was repeated five times. Furthermore, the training set was split into training (80% of the training set) and validation (20% of the training set) subsets, and the validation subset was used to tune the hyperparameters of the CNN. SE, SP, and ACC were used to evaluate the classification performance and were calculated as follows:

$$\mathrm{SE} = \frac{TP}{TP + FN}, \qquad \mathrm{SP} = \frac{TN}{TN + FP}, \qquad \mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN},$$

where TP (true positive) and TN (true negative) are the numbers of UC and non-UC EHG segments that were correctly classified, FP (false positive) is the number of non-UC EHG segments falsely classified as UC, and FN (false negative) is the number of UC EHG segments falsely classified as non-UC. The results of SE, SP, and ACC from the fivefold cross-validation were calculated and averaged to evaluate the CNN. Furthermore, the established CNN was utilized to classify EHG segments of 10 s, 20 s, and 30 s to distinguish between UCs and non-UCs. The results of SE, SP, and ACC could indicate which EHG segments were efficient for recognizing UC with the CNN.
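The evaluation procedure can be sketched in a few lines. The sketch assumes the standard definitions SE = TP/(TP + FN), SP = TN/(TN + FP), and ACC = (TP + TN)/(TP + TN + FP + FN), with UC as the positive class, and a random segment-level split into five folds; the paper does not state how segments were assigned to folds, and all function names are hypothetical.

```python
import random


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN with UC encoded as the positive label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def se_sp_acc(tp, fp, tn, fn):
    """Sensitivity, specificity, and accuracy (each class must be present)."""
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    acc = (tp + tn) / (tp + fp + tn + fn)
    return se, sp, acc


def five_fold_indices(n_items, n_folds=5, seed=0):
    """Yield (train, test) index lists; assumption: random segment-level split."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, folds[i]


# Toy example with 4 UC (1) and 4 non-UC (0) segments.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
se, sp, acc = se_sp_acc(*confusion_counts(y_true, y_pred))
```

Averaging the three values over the five test folds gives the summary figures of the kind reported for the 60 s segments. Note that 14272 segments do not divide evenly by five, so fold sizes in this sketch differ by at most one.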

Statistical Analysis.
One-way ANOVA with Tukey's method was performed using the software SPSS 22 (SPSS Inc.) to compare SE, SP, and ACC between EHG segments with the same duration. A p value less than 0.05 was considered statistically significant.

Results

Evaluation of CNN Performance with EHG Segments of 60 s Duration.
With the training set, the ACCs of the five validations were 0.99, 1.00, 0.99, 0.98, and 0.99 and the loss ratios were 0.07%, 0.01%, 0.11%, 0.09%, and 0.08%. With the testing set, an average SE of 0.82, SP of 0.93, and ACC of 0.88 were achieved, as shown in Table 2.
Table 3 shows the results from twelve 10-second EHG segments (10_L1∼6 and 10_R1∼6), six 20-second EHG segments (20_L1∼3 and 20_R1∼3), and four 30-second EHG segments (30_L1∼2 and 30_R1∼2). As indicated in Table 3, the values of SE and ACC from the EHG segments around the TOCO peak (10_L1 and 10_R1, 20_L1 and 20_R1, and 30_L1 and 30_R1) were higher than those from the other segments of the same duration (10_L2∼6 and 10_R2∼6, 20_L2∼3 and 20_R2∼3, and 30_L2 and 30_R2), with each side of the TOCO peak compared separately. In contrast, the values of SP were similar among EHG segments of different durations. Besides, the values of SE from 20 s EHG segments around the TOCO peak (20_L1 and 20_R1) were higher than those from 10 s and 30 s EHG segments on the same side of the TOCO peak (10_L1, 30_L1 and 10_R1, 30_R1).
Figure 5 shows SE from EHG segments of different durations. The range of the time difference between TOCO and EHG segments of UC at the start and end points is highlighted by shades of grey, and the mean of the time difference is denoted by a cross. In terms of 10 s duration, the SE values from 10_L1∼4 and 10_R1∼4 were significantly larger than those from 10_L6 (p < 0.05). In terms of 20 s duration, the SE values from 20_L1∼2 and 20_R1∼2 were significantly larger than those from 20_L3 (p < 0.05). In terms of 30 s duration, the SE values from 30_L1 and 30_R1 were significantly larger than those from 30_L2 (p < 0.05). No significant difference was found between the start and end points (p > 0.05).

Discussion
In this paper, a CNN model was built with UC and non-UC EHG segments of 60 s duration and then applied to recognize UCs using different EHG segments (different durations and different positions relative to a TOCO peak). The results indicate the efficient EHG segments that could be used to recognize UCs and monitor pregnancy progress in the future. To the best of our knowledge, this is the first study on the duration and position of EHG segments in distinguishing between UCs and non-UCs. The EHG segment corresponding to UC was investigated with the TOCO peak as a reference, which was easier to identify than the start and end points. In this study, the EHG segments from different channels and different gestational weeks were mixed together because of the small dataset. The duration of the EHG segment was selected based on the following considerations: (i) the duration of 10 s performed the best in identifying and tracking uterine activity across subjects [14] and (ii) most of the UC durations were no more than 60 s based on the Icelandic EHG database and clinical experience.
Several EHG analysis methods including the nonlinear correlation coefficient H² [5], cross-correlation [14], and root-mean-square envelope [28] have been proposed to improve the accuracy of UC detection. However, none of them considered the effects of the duration and position of the EHG segment on UC detection. Table 3 and Figure 5 show that EHG segments around the TOCO peak (10_L1 and 10_R1, 20_L1 and 20_R1, and 30_L1 and 30_R1) achieved higher SE and ACC than the others, indicating that they are more efficient for recognizing UCs. Furthermore, the SE of 20 s EHG segments (20_L1 and 20_R1) was better than that of segments of other durations. Segments of 10 s and 30 s were presumed to contain only part of a UC or additional non-UC activity, which may have reduced their ability to identify UCs. We also observed different results on the two sides of the TOCO peak, which might be due to the UC variation during pregnancy [35] and the imprecise synchronization between TOCO and EHG.
In this study, we focus on comparing EHG segments on each side of the TOCO peak separately because the rising and descending phases of the EHG segment reflect the tension and relaxation of the myometrium and may have different effects on recognizing UCs. Furthermore, recognition of UCs using EHG segments covering both sides of the TOCO peak (similar to 10_L1 + 10_L2 + 10_R1 + 10_R2) is also indispensable, which has been investigated in our previous study [36].
At the current stage, the EHG segments corresponding to UCs and non-UCs were first manually extracted to train a CNN model. The established CNN could then be applied to the EHG segments determined by our study to automatically recognize UC; thereafter, manual segmentation is no longer required. The ability to differentiate UCs from non-UCs could be improved with the efficient EHG segments. The clinicians agreed that our proposed method is very promising and could be applied to long-term UC monitoring in practice. The results were obtained from the combination of 16-channel EHG signals because of the small dataset at present. We will work on reducing the number of EHG-recording electrodes for clinical application. The current CNN model was built from a limited number of images from different gestational ages, and the ability to recognize UC may vary depending on gestational age. More data on different gestational ages could be collected to build models in scale-up studies to eliminate the influence of different gestational ages and improve the usability of the CNN technique. The signal-to-noise ratio (SNR) of the EHG also affected UC recognition. Therefore, the effects of the EHG channel, gestational week, and SNR will be investigated to further improve UC detection.

Conclusion
The proposed method could be used to determine the efficient EHG segments for recognizing UC with the CNN. The results showed that the EHG segments around the TOCO peak achieved higher SE and ACC than the others with the same duration, which indicated that they are efficient for UC detection.

Data Availability
The database used in this study is available for access via the following link: https://physionet.org/pn6/ehgdb/.