1D CNN-Based Intracranial Aneurysms Detection in 3D TOF-MRA



Introduction and Motivation
Intracranial Aneurysms (IAs) are an important cause of the high morbidity and mortality of cardiovascular diseases [1,2]. It has been reported that about 3% of healthy adults have IAs [2]. Rupture of IAs is the major cause of subarachnoid hemorrhage, which often leads to severe neurological sequelae and even death [3]. Presently, Digital Subtraction Angiography (DSA) based on X-ray is still regarded as the gold standard in diagnosing IAs, since it can accurately reflect the location, scope, and degree of IAs; its sensitivity is more than 95% [4,5]. Nonetheless, DSA may lead to neurological problems due to its invasiveness [6]. Unlike DSA, Three-Dimensional Time-of-Flight Magnetic Resonance Angiography (3D TOF MRA) is noninvasive, radiation-free, and highly sensitive. As such, it has been widely used to detect aneurysms [7][8][9]. However, detecting unruptured aneurysms by inspecting the MRA images frame by frame is a challenging and laborious task for radiologists. Moreover, the sensitivity of manual diagnosis is only about 64-70% [10].
Computer Aided Diagnosis (CAD) has been widely employed to help doctors detect IAs [1,11,12]. Earlier CAD systems relied on low-level or hand-crafted features, whose effectiveness depends on the domain knowledge of their designers. Generally, results based on hand-crafted features are neither robust nor universal. To overcome this obstacle, feature learning-based methods were proposed, in which the feature detection algorithm is learned from existing samples rather than designed by domain researchers. Recently, the Convolutional Neural Network (CNN) has been increasingly applied to the detection, classification, and segmentation of medical images [13][14][15][16].
The 3D CNN-based method utilizes the spatial structure information of the volume and performs excellently by feeding the 3D image into the CNN directly. Sichtermann established an IA detection system based on a 3D CNN [17]. This system used extensive preprocessing and postprocessing to reduce the number of false positives and achieve high detection sensitivity. Allison used Computed Tomography Angiography (CTA) images of the brain to construct the HeadXNet model for aneurysm segmentation [18], which predicted aneurysms with high sensitivity. Bio Joo conducted IA detection based on a 3D ResNet, leading to better results [19]. A cascade strategy was proposed to automatically detect Cerebral MicroBleeds (CMBs) from MR images using a 3D CNN [20]. The CMBs detected in the second stage could be utilized as a reference for aneurysm detection. The 3D CNN-based methods can undoubtedly achieve excellent performance. However, the difficulty of obtaining large numbers of 3D medical images limits their practical application.
The Maximum Intensity Projection (MIP) strategy projects the original volume into 2D images to reduce the demand for samples in CNN-based detection and to speed up CNN training. Based on this strategy, Nakao generated 9 MIP images, concatenated them into a new image as input, and then performed classification using a 2D CNN [21]. These MIP images should contain the main structural information of the original volume. As such, the method obtains an accuracy of 95%, which is comparable to the 3D CNN-based method. Stember trained a U-net CNN with 250 MIP images to predict the size of aneurysms. Although its results are good, it is still limited in some special cases [22]. Duan proposed a secondary cascade CNN architecture to detect aneurysms [23]. Ueda utilized ResNet-18 to detect aneurysms, achieving sensitivities of 93% and 91% on internal and external datasets, respectively [24].
Compared with the 3D CNN-based method, the MIP-based methods greatly reduce the requirements for computer performance and the number of samples [21]. However, their demands are still considerable. Therefore, we propose a new solution that converts the 2D CNN-based method into the 1D case. The main idea is that each MIP image is projected into several 1D vectors, and the 2D CNN is replaced by a 1D one. That is, we further reduce the dimension of the input data in the same way that the MIP is generated. As such, this new strategy further reduces the redundant information and simplifies the structure of the network. In addition, the 1D CNN can be trained on a CPU, while 2D or 3D CNNs generally need to be trained on a GPU. This means that the demand for computer performance is greatly reduced. The contributions of this paper are as follows. (1) The MIP strategy is extended to the 2D case, and 1D vectors are used to represent the 2D image, which reduces the dimension of the original data. (2) Compared with the MIP-based method, the 1D CNN is more efficient because it requires fewer samples and shorter training time, while the accuracy is retained. (3) This approach can be extended to other pattern recognition problems.

The Basic Idea.
The proposed strategy is illustrated in Figure 1. In the 3D TOF MRA volume, we first extract a series of voxels along the cerebral vessels by thresholding the original 3D image. For each voxel, we segment a 3D patch centered at that voxel from the 3D TOF MRA volume. The patch is called the Region of Interest (ROI). For each ROI, several MIP images are generated by projecting the ROI along several directions [21]. Note that the MIP images contain the main features of the original volume. Accordingly, several 1D vectors (each of size m) can be obtained by accumulating the pixels of a MIP image along different directions. Assume that we conduct the accumulation along n directions. Then n 1D vectors of size m are obtained for each MIP image. Afterwards, these 1D vectors are concatenated into a new 1D vector whose length is m * n. In our experiment, we generate nine MIP images for each ROI. As such, there are nine concatenated 1D vectors of length m * n for each ROI.
Then, these nine concatenated vectors are connected into a single 1D vector of length 9 * m * n, as shown in Figure 1. Meanwhile, we establish a 1D CNN for aneurysm detection, whose input is the 1D vector of length 9 * m * n and whose output is 1 or 0 depending on whether there is an aneurysm in the considered patch. After training the 1D CNN, we can use it to detect aneurysms in the TOF MRA volume. In short, we further project the MIP images into 1D vectors and establish the related 1D CNN for aneurysm detection, which reduces the requirement for samples and improves the efficiency of training.
The direct way to detect IAs in a 3D TOF-MRA image is to establish a 3D CNN whose input is the original 3D ROI patch. The MIP-based method takes several 2D images as the input and replaces the 3D CNN with a 2D one, which reduces the dimension of the input data and the amount of calculation. By comparison, the 1D vectors obtained from the MIP images, shown at the bottom of Figure 1, are taken as the input of a 1D CNN for detection or classification. The demand for samples and the training computation are both reduced in 1D CNN-based aneurysm detection. Meanwhile, the main characteristics of the original 3D patch are retained in the MIP images, and the main characteristics of the MIP images are retained in the generated 1D vectors. This means that the 1D vectors retain the main characteristics of the 3D patch, and the classification accuracy should be close to that of the 3D case.
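The size bookkeeping behind this idea can be checked with a few lines of arithmetic. The sketch below assumes the values reported elsewhere in the paper (16 * 16 * 16 ROIs, nine 16 * 16 MIP images, m = 16, and n = 4); the variable names are illustrative only.

```python
# Size bookkeeping for one ROI, using the values reported in the paper.
roi_side = 16   # each ROI is 16 x 16 x 16 voxels
num_mips = 9    # nine MIP images per ROI
m = 16          # length of each 1D projection vector after resizing
n = 4           # accumulation directions per MIP image

voxels_3d = roi_side ** 3               # elements in the 3D patch
pixels_2d = num_mips * roi_side ** 2    # elements in the nine MIP images
per_mip_len = m * n                     # concatenated vector for one MIP image
vector_1d = num_mips * per_mip_len      # final 1D CNN input length

ratio_vs_3d = vector_1d / voxels_3d
ratio_vs_2d = vector_1d / pixels_2d

print(voxels_3d, pixels_2d, per_mip_len, vector_1d)
print(f"{ratio_vs_3d:.2%} of the 3D data, {ratio_vs_2d:.0%} of the MIP images")
```

With these values the 1D input has 576 elements, compared with 4096 voxels in the 3D patch and 2304 pixels in the nine MIP images.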

Data Preprocessing.
In generating the samples, some preprocessing operations are required for the original 3D TOF MRA volume. The volume is often anisotropic, and the intensities of volumes captured by different machines may differ. Therefore, resampling and normalization are conducted for all slices of the volume. The classic bicubic interpolation algorithm is used for resampling to obtain an isotropic volume [25]. Meanwhile, grayscale stretching is applied to improve the homogeneity of these slices. We utilize the piecewise linear transformation function of equation (1) to make the gray-value distribution homogeneous. In equation (1), x and y are the gray values of the original and stretched images, max and min denote the maximum and minimum gray values of the original image, and parameters a and b determine the slope of the gray stretching.
IAs are attached to the blood vessels, whose intensity differs markedly from that of other tissues in brain MRA images. Therefore, we can approximately separate the blood vessels from other tissues on the basis of a threshold. The Otsu method and the Hessian matrix are utilized to fulfill this task [26,27]. Then, we judge whether there are IAs in the 3D TOF MRA data by taking only the blood vessels into account. It should be noted that all positive samples contain at least one intracranial aneurysm, and all negative samples contain none. A 3D patch of size 16 * 16 * 16, centered at a voxel of the detected cerebral vessel, slides along the vessels to generate a series of ROIs. Among all ROIs, those with intracranial aneurysms are taken as positive samples, and the ROIs near the circle of Willis without intracranial aneurysms are considered negative samples [28]. The process of obtaining the ROIs is shown in Figure 2.
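The thresholding and patch extraction described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it uses a plain Otsu threshold on a synthetic volume, omits the Hessian-based vessel analysis, and the helper names `otsu_threshold` and `extract_rois` and the `stride` parameter are hypothetical.

```python
import numpy as np

def otsu_threshold(volume, bins=256):
    """Return the bin edge that maximizes the between-class variance (Otsu)."""
    hist, edges = np.histogram(volume, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist).astype(float)     # voxel count at or below each bin
    w1 = w0[-1] - w0                       # voxel count above each bin
    s0 = np.cumsum(hist * centers)         # cumulative intensity mass
    mu0 = s0 / np.maximum(w0, 1)
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)
    between = w0 * w1 * (mu0 - mu1) ** 2
    return edges[np.argmax(between) + 1]   # upper edge of the best split bin

def extract_rois(volume, threshold, size=16, stride=4):
    """Cut size^3 patches centered on every stride-th above-threshold voxel."""
    half = size // 2
    shape = np.array(volume.shape)
    rois = []
    for center in np.argwhere(volume > threshold)[::stride]:
        lo = center - half
        hi = lo + size
        if (lo < 0).any() or (hi > shape).any():
            continue  # skip patches that would cross the volume border
        rois.append(volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]])
    return rois

# Synthetic stand-in for a TOF MRA volume: noisy background plus a bright tube.
rng = np.random.default_rng(0)
volume = rng.normal(10.0, 1.0, size=(32, 32, 32))
volume[14:18, 14:18, 4:28] += 100.0

t = otsu_threshold(volume)
rois = extract_rois(volume, t)
print(len(rois), rois[0].shape)
```

Sampling every `stride`-th vessel voxel is only a convenient way to thin out overlapping patches in this sketch; the paper does not state the step used when sliding the patch.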

Generation of 1D Vector.
After data preprocessing, we obtain a series of normalized ROIs, and we get nine MIP images for each ROI through nine projections [21]. For each MIP image, we adopt a new strategy called Pixel Accumulation Projection (PAP) to generate 1D vectors by accumulating the pixels along a direction. The generated 1D vectors contain the main information of the corresponding MIP image. That is, the MIP image can be reconstructed from these projection vectors. The schematic and formula of the accumulation are shown in Figure 3 and equation (2), respectively. The MIP image is first expanded to its circumcircle by padding with zeros. These zero pixels do not affect the accumulated values in generating the 1D vectors. Then, the circumcircle image is projected along four different directions to obtain four 1D vectors by accumulating the pixels as in equation (2), where n is the number of pixels along the direction and x_k denotes the value of the kth pixel along the direction. It should be noted that the four generated vectors are each resized to length 16 and then concatenated into a 1D vector of length 64. As such, the nine MIP images mentioned above are further simplified into nine concatenated 1D vectors. The whole process of 1D vector generation is illustrated in Figures 4 and 5. As shown in Figure 6, the considered MRA is less likely to contain intracranial aneurysms when the generated vector is smooth (the vector is shown as a curve), and vice versa.
In our experiment, the original volume (shape 1) is first projected into nine MIP images (shape 2), which are further simplified into nine concatenated 1D vectors of length 64 (shape 3). These nine 1D vectors can also be concatenated again into one 1D vector of length 576 (shape 4). The sizes of the different data are shown in Table 1 and Figure 5. Shape 1 is the 3D image and shape 2 corresponds to the MIP image. Shape 3 denotes the four vectors in the middle of Figure 4, and shape 4 is the concatenated vector of size 576.
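The MIP and PAP steps can be illustrated with a short NumPy sketch. Several assumptions are made here: each element of a projection vector is taken to be the plain sum of the pixels along one line, the four directions are assumed to be 0, 90, 45, and 135 degrees, linear interpolation is used for resizing, and only three axis-aligned MIP images are computed because the nine projection directions are not specified in this text. The zero padding to the circumcircle is skipped since, as noted above, it does not change the accumulated values. The function names are hypothetical.

```python
import numpy as np

def mip(volume, axis):
    """Maximum intensity projection of a 3D volume along one axis."""
    return volume.max(axis=axis)

def resize_1d(vec, length):
    """Linearly resample a 1D vector to a fixed length."""
    src = np.linspace(0.0, 1.0, num=len(vec))
    dst = np.linspace(0.0, 1.0, num=length)
    return np.interp(dst, src, vec)

def pap(image, length=16):
    """Accumulate pixels along four directions and concatenate the results."""
    h, w = image.shape
    offsets = range(-(h - 1), w)
    rows = image.sum(axis=1)   # sum along each row
    cols = image.sum(axis=0)   # sum along each column
    diag = np.array([np.trace(image, offset=k) for k in offsets])
    anti = np.array([np.trace(image[:, ::-1], offset=k) for k in offsets])
    return np.concatenate([resize_1d(v, length) for v in (rows, cols, diag, anti)])

rng = np.random.default_rng(0)
roi = rng.random((16, 16, 16))                 # stand-in for one normalized ROI
mips = [mip(roi, axis) for axis in range(3)]   # three axis-aligned views only
vectors = [pap(img) for img in mips]           # one length-64 vector per MIP
feature = np.concatenate(vectors)
print(mips[0].shape, vectors[0].shape, feature.shape)
```

With nine MIP images instead of three, the same concatenation would give the length-576 input (shape 4) used by Model-1, while the nine separate length-64 vectors (shape 3) would feed the multichannel Model-2.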

1D Convolutional Neural Network.
Based on the previous discussion, our proposed method aims to reduce the training computation and the demand for samples. To verify the efficiency and accuracy of the proposed method, two 1D CNNs are used in our experiments. The overall structures of the two models are shown in Figures 7 and 8. Model-1 is an 11-layer CNN that includes four convolutional layers, four maximum pooling layers, and three fully connected layers. Its input is a 1D vector with 576 elements, as shown in Figure 5. Model-2 is a multichannel network, where nine inputs are connected to two fully connected layers via two convolutional layers and two maximum pooling layers. The parameters of the two models are listed in Tables 2 and 3, respectively.
The convolutional layers are designed to extract the aneurysm features, and the network is trained using the backpropagation algorithm. The weight-sharing mechanism is applied to the neurons located on the same feature map to reduce the risk of overfitting.
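Weight sharing in a 1D convolutional layer means that one small kernel is slid over the whole input vector. The sketch below shows a single convolution followed by ReLU and maximum pooling, the building blocks used in both models. The 3-tap kernel and the pooling size of 2 are assumptions for illustration, since the actual layer parameters are those of Tables 2 and 3.

```python
import numpy as np

def conv1d(x, kernel, bias=0.0):
    """Valid 1D cross-correlation: the same kernel is applied at every position."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) + bias
                     for i in range(len(x) - k + 1)])

def relu(x):
    """Rectified linear unit, max(x, 0) element-wise."""
    return np.maximum(x, 0.0)

def max_pool1d(x, size=2):
    """Non-overlapping max pooling; a trailing remainder is dropped."""
    n = len(x) // size
    return x[:n * size].reshape(n, size).max(axis=1)

x = np.arange(576, dtype=float) % 7 - 3   # stand-in for a 576-element input
kernel = np.array([1.0, 0.0, -1.0])       # hypothetical 3-tap kernel
features = max_pool1d(relu(conv1d(x, kernel)))
print(features.shape)
```

A layer with several feature maps would apply several such kernels in parallel, and in a trained network the kernel weights are learned by backpropagation rather than fixed by hand.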
The pooling mechanism is implemented by calculating the maximum or average value of the convolution features between adjacent neurons in the previous convolutional layers. Maximum pooling layers are utilized to reduce the dimension of the extracted features. We use several convolutional layers and maximum pooling layers to extract the features of the 1D vectors. Finally, the output layer after the fully connected layers uses the softmax activation function to predict the probability of an aneurysm. The binary cross-entropy between the prediction and the label is taken as the loss function in training the CNN:
$$L = -\left[t \log p + (1 - t) \log(1 - p)\right], \tag{3}$$
where p is the prediction and t denotes the label. To avoid overfitting and accelerate network convergence, Batch Normalization (BN) is utilized [29]. The Rectified Linear Unit (ReLU) is employed as the nonlinear activation function in the convolutional and fully connected layers [30]. The 1D convolution kernels are initialized as in [31]. ReLU is defined as follows:
$$\mathrm{ReLU}(x) = \max(x, 0). \tag{4}$$
The number of intracranial aneurysms in all samples is 187, with an average diameter of 7.5 mm. The minimum and maximum diameters are 2.4 and 23 mm. Figure 9 shows the distribution of the diameters; most of them are between 5 mm and 10 mm. Under the guidance of radiologists, 17 indistinguishable samples are removed, and the remaining 170 available samples are taken as positive samples. Meanwhile, 180 negative samples without intracranial aneurysms are cropped from the original volumes.

Data Augmentation.
To expand the training samples, we perform translation and rotation for the original volume.
The MIP images after augmentation by the two strategies are shown in Figure 10. Compared with the translation-based method, more diverse samples can be obtained by rotating the volume. After rotating the volume by several angles, we crop ROIs from the rotated volumes to obtain MIP images.
The specific steps are as follows.
(1) The original DICOM images are resampled after preprocessing to make the TOF MRA volume isotropic according to [32]. (2) The blood vessels are detected based on the Otsu algorithm [26].

Evaluation Metrics.
Four quantities are usually used in evaluating binary classification. TP indicates the proportion of positive samples correctly predicted as positive, FN the proportion of positive samples incorrectly classified as negative, FP the proportion of negative samples wrongly predicted as positive, and TN the proportion of negative samples correctly classified as negative. In these experiments, the following four metrics, built on these quantities, are employed to quantitatively evaluate the performance of the different methods.
They are accuracy, precision, recall (also called sensitivity), and F1 score [33]. Accuracy denotes the proportion of correct classifications, precision is the proportion of samples predicted as positive that are truly positive, sensitivity is the proportion of positive samples that are classified as positive, and the F1 score is the harmonic mean of precision and sensitivity.
These metrics are computed by the following equations:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Precision} = \frac{TP}{TP + FP},$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.$$
In addition, the Receiver Operating Characteristic (ROC) curve with the Area Under the Curve (AUC) is also employed to evaluate the performance of the different methods [34]. Taking the false positive rate and the sensitivity as the abscissa and ordinate, respectively, the ROC curve expresses the trade-off between TP and FP at different thresholds. Computed as the area under the ROC curve, the AUC represents the probability that a positive sample is ranked ahead of a negative sample. The closer the AUC is to 1, the better the result is. The value of the AUC is computed as follows:
$$\mathrm{AUC} = \frac{\sum_{i \in p} \mathrm{rank}_i - \frac{m(m+1)}{2}}{m \cdot n},$$
where rank_i is the rank of the ith sample when the samples are sorted in ascending order of predicted probability, p represents the positive class, and m and n are the numbers of positive and negative samples.
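The four metrics and the rank-based AUC can be computed with a few lines of plain Python. This is a minimal sketch using the standard definitions; the function names are illustrative, tied scores are given their average rank, and both classes are assumed to be non-empty.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall (sensitivity), and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

def rank_auc(scores, labels):
    """AUC from ascending ranks of the scores; labels are 1 (positive) or 0."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1   # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    m = sum(labels)             # number of positive samples
    n = len(labels) - m         # number of negative samples
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - m * (m + 1) / 2) / (m * n)

acc, prec, rec, f1 = classification_metrics(tp=8, fp=2, tn=8, fn=2)
auc = rank_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(acc, prec, rec, f1, auc)
```

In the small example, the two positive samples have ranks 2 and 4, so the AUC is (6 - 3) / (2 * 2) = 0.75.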

Experiment Result
The proposed method projects each MIP image to several 1D vectors by accumulating the pixels in several directions and then establishes a 1D CNN to detect intracranial aneurysms. The aim is to accelerate the training and reduce the demand for samples. To validate the effectiveness of the proposed method, three kinds of experiments are conducted.

Training with Different Number of 1D Vectors.
The accuracy of aneurysm detection is expected to be lower with fewer projections (1D vectors). However, fewer 1D vectors also mean that the training can be greatly accelerated. To trade off accuracy against efficiency, we investigate the appropriate number of projections. Specifically, three groups of 1D vectors are generated by projecting each MIP image along 2, 4, and 8 directions, and they are divided into training, validation, and test sets. It is worth noting that two kinds of samples are generated for each case, which correspond to shape 3 and shape 4 listed in Table 1.
Then, 9800 samples (4760 positive samples and 5040 negative samples) are used to train the two models shown in Figures 7 and 8 on a GPU.
After training, we test on 2800 test samples (1360 positive samples and 1440 negative samples). The results are shown in Table 5.
The ROC curves under the different projection configurations are illustrated in Figures 11 and 12. Both the performance and the computation of the two models increase with more inputs. The performance with 4 projections is significantly better than with 2: the accuracy and precision are raised by about 2% and 3%, respectively, while the computation amounts are close. However, the performance does not increase noticeably when the number of projections reaches 8. A possible reason is that more projections do not capture more features from the MIP image but do cause more computation. Considering both factors, 4 projections are used in the following experiments.

Training with Different Number of Samples.
To verify the effectiveness and robustness of the proposed method, the above two 1D networks are used in two comparative experiments with the same original samples under different augmentations. Meanwhile, two corresponding 2D CNNs with similar architectures are also trained using the corresponding MIP images. These training tasks are conducted on both GPU and CPU. The results are shown in Tables 6 and 7. For all cases, the more samples there are, the better the results. When the training data become fewer (data 1, data 2, and data 3 listed in Table 4), the performance of the 1D CNNs is better than that of the 2D CNNs. The accuracy of the 1D CNN is about 1% to 4% higher than that of the 2D CNN, and the precision and sensitivity are also slightly higher. That is, the 2D CNN is more sensitive to the number of samples, whereas there is no significant difference among these models when a large number of training samples, such as data 6 and data 7, are utilized. When training on a GPU, the cost per epoch of the 1D CNNs is less than that of the 2D CNNs. Moreover, the GPU utilization rate for the 1D CNN is about 1/3 to 2/5 of that of the 2D CNN. When training on a CPU, the CPU is almost 100% utilized, and the training time of the 1D CNN is much less than that of the MIP image-based case. This is because the 1D CNN effectively reduces the size of the data and thus simplifies the network. These experiments show that the proposed method is effective and robust, especially when samples are scarce.

Compared with Other Methods.
This section compares our two 1D CNNs with two classic models [20,21]. The first model is composed of two convolution layers, two maximum pooling layers, and three fully connected layers. Its input is the original ROI sample (shape 1) and its output is the probability that an aneurysm is located in this ROI. The second one employs the concatenated MIP images to detect aneurysms on the basis of a 2D CNN. Nine MIP images are vertically concatenated as the input of the network, which consists of two convolution layers, two maximum pooling layers, and two fully connected layers. Our first model is shown in Figure 7. It is an 11-layer 1D CNN with four convolution layers, four maximum pooling layers, and three fully connected layers. The input of this model is a 1D vector of length 576 (shape 4). The last one employs a multichannel network model with an input of nine 1D vectors of length 64 (shape 3). For each channel, a 4-layer CNN composed of two convolution layers and two maximum pooling layers is utilized to extract aneurysm features. After concatenating these features, two fully connected layers are used for classification.
After training the four models with 9800 samples on a GPU, the results are listed in Table 8. The ROC curves of these methods are shown in Figure 13. "Ours (con-input)" denotes the result of the third model, trained with the concatenated 1D vector (shape 4 in Table 1), while "ours (9 inputs)" denotes that of the last model, trained with nine 1D vectors of length 64 (shape 3 in Table 1). The performance of our proposed method is similar to that of [20] and slightly better than that of [21]. The multichannel model outperforms the other three strategies, especially in terms of sensitivity. In addition, the computation of our method is lighter in GPU training.

Discussion
There are many studies on intracranial aneurysm detection and classification based on 2D or 3D CNNs [18,19,21,24]. To the best of our knowledge, however, this is the first time that a 1D CNN has been used in this application. Inspired by the MIP, the 1D vectors are generated by accumulating the pixels of the 2D MIP image in certain directions. Similar to the principle of CT imaging, these 1D vectors contain the main features of the related MIP images. Importantly, the 1D CNN replaces the traditional 2D CNN, which greatly reduces the number of parameters and simplifies the training. We compared the proposed strategy with the traditional 2D CNN-based methods, and the experiments show that the results of our method are close to or even better than those based on 2D CNNs. Moreover, our method can be trained with fewer samples on a CPU platform. In contrast, 2D CNN training requires more training samples and a high-performance GPU. That is, the proposed method reduces the demands for samples and computer performance. The proposed strategy could also be extended to other tasks, including feature detection and image classification.
In our experiment, all the original data are augmented by 3D rotation. Traditional two-dimensional translation and rotation are common methods of data expansion. However, limited by the rotation angle and translation distance, these methods are not ideal for our small dataset, since the augmented images show a strong correlation with the original image. In our application, 3D rotation is applied to expand the original data, which makes the augmented data more diverse. The 3D rotation method could also be applied in other data augmentation processes.
It should be admitted that information is lost in the process of dimensionality reduction. On the other hand, the removal of redundant information makes the detection and classification easier. In the proposed method, the number of 1D vectors has been shown to be a key factor in the controlled experiments. The more 1D vectors there are, the more feature information of the data is retained and the better the classification result is. Yet, more 1D vectors increase the training computation. Therefore, a balance should be determined for each specific application.

Conclusion
This paper proposes an IA detection method that introduces a 1D CNN for classification. Inspired by the MIP-based IA detection method, we further generate several 1D vectors from each MIP image and then input the 1D vectors to the 1D CNN to determine whether there is an IA in the considered 3D image patch. In other words, the MIP-based method transfers the traditional 3D intracranial aneurysm detection into a 2D image classification problem, and we then transfer it into a 1D case. The size of the one-dimensional vector generated by our method is 14.06% of the original 3D data and 25% of the 2D MIP images. Correspondingly, the number of parameters of the 1D network is 44.23% of that of the 2D network. As such, this approach greatly accelerates the training process and reduces the demand for samples in training the CNN. According to the experiments, our strategy is simple and effective. The proposed strategy achieves 95.86% detection accuracy, which is even better than the traditional methods.
In CPU training, the proposed method is about 10 times more efficient than the classic MIP-based method. This makes it possible to train with a CPU in clinical settings. In addition, this strategy can be applied to other 2D- and 3D-image-related applications, which may greatly reduce the difficulty of training. Meanwhile, other issues deserve further attention. The first is how to generate the 1D vectors from the MIP image more efficiently. This paper simply accumulates all pixel intensities along each direction, which is just one example; more sophisticated ways need to be researched. The second is that the 1D CNN structure and optimization should be analyzed to further improve the classification accuracy. Future work should also involve clinical data acquisition, preprocessing, etc.

Figure 1: An overview framework for 1D CNN-based intracranial aneurysm detection.

(3) Taking the center of the aneurysm as the origin, the TOF MRA volumes are randomly rotated around the x-, y-, and z-axes by plus or minus 0, 5, 10, 15, 20, 25, or 30 degrees to augment the data 40, 30, 20, 10, 5, and 3 times. (4) A series of ROIs of size 16 * 16 * 16 are cropped from the rotated volumes. The augmented samples are divided into training, validation, and test sets according to Table 4.

Figure 11: ROC curves of Model 1 with different projection number configurations.

Table 1: The four kinds of data.

Table 2: The architectures of 1D CNN discrimination Model 1.

Table 3: The architectures of 1D CNN discrimination Model 2.

Table 5: Results of different projection number configurations.

Table 4: The details of the dataset.

Table 6: Detection results of Model 1 under different samples.

Table 7: Detection results of Model 2 under different samples.

Table 8: Comparison with other methods.