A Fault Detection Method of Memristor in Chaotic Circuit Based on Artificial Neural Network

The memristor is the fourth basic circuit component, after the resistor, capacitor, and inductor. It is nonlinear and has a memory function, and it shows great application prospects in the fields of memory, neural networks, logic operations, chaotic circuits, and so on. The appearance of the memristor provides a new choice for the circuit realization of chaotic systems. A chaotic circuit based on a memristor has different characteristics from a chaotic circuit constructed with general components. Therefore, we studied the chaotic circuit based on the memristor, which is of great significance to the application of memristive chaotic attractors in practical engineering. In order to improve the accuracy and effectiveness of memristors in various circuits, a method combining an artificial neural network and fuzzy analysis is used for fault detection of memristors in chaotic circuits, which can accurately judge whether the memristors in the circuit are faulty. The work of this paper is as follows: (1) The development status of memristors and the detection methods related to memristor faults are introduced in detail. This paper also introduces and summarizes the development of machine learning, especially neural networks, with reference to related theories in memristor fault detection. (2) The relevant theory of CNN is introduced, and a one-dimensional CNN is proposed for fault diagnosis. Based on this, an improved method, namely, the artificial feature enhancement model, is proposed. (3) A memristor fault detection data set is designed, and the validity of the structure and parameters of the selected CNN is verified. Finally, the effectiveness and superiority of the improvement are verified by comparing the CNN model and the artificial feature enhancement model.


Introduction
Before 1971, it was widely believed that there were only three basic circuit elements: resistors, capacitors, and inductors. However, after Professor Chua published the paper "Memristor-The Missing Circuit Element," the memristor changed people's perception of traditional electronic circuits [1]. A memristor is a nonlinear passive two-terminal element that characterizes the relationship between charge and magnetic flux. As the long-missing fourth basic circuit element, the memristor has unique characteristics that the other three basic circuit elements do not have. The most special of these is its memory function: its resistance depends on the amount of charge that has flowed through it, so it can memorize the current that has passed through it, hence the name "memory resistor" [2]. Although Professor Chua proposed the existence of memristors, no physical memristor was known at that time. It was not until 2008 that HP Labs realized the first real memristor [3]. With the physical realization of memristors, memristor research has once again attracted extensive attention from scholars. Owing to their unique memory characteristics, memristors have broad application prospects in many fields such as model analysis, circuit design, industrial engineering, and biological memory behavior simulation [4][5][6][7]. At present, research on memristors proceeds mainly along two lines. The first follows the research direction of Hewlett-Packard Laboratory and considers how to realize more economical nanomemristor devices with memory characteristics, so that memristors can be commercialized [8]. The second is research on memristor applications, such as nonvolatile storage, secure communications, neural networks, signal processing, sample recognition, process control, radio frequency identification, biosensors, and chaotic circuits [9].
Chaos theory was first proposed by the American meteorologist Lorenz in 1963. Owing to the nonlinearity, diversity, and multiscale nature of chaotic systems, they can produce rich dynamic behavior and therefore play a very important role in the field of secure communication. As a nonlinear element, the memristor can easily generate chaotic oscillations. Memristive chaotic circuits, as an important branch of memristor research, have been extensively discussed [10]. As early as 2008, Professor Itoh and Professor Chua used a memristor to replace the diode in the traditional Chua's chaotic system and constructed several memristor-based chaotic systems, which was the first combination of memristor and chaotic circuit. In 2009, a professor at the University of California, Berkeley, proposed a new chaotic circuit based on memristors, in which the dimension of the circuit is higher than that of the traditional Chua's circuit [11]. In 2010, a smooth memristive oscillator based on the traditional Chua's oscillator was proposed, which can exhibit transient chaotic phenomena [12]. In it, Chua's diode is replaced by a smooth flux-controlled memristor and a negative conductance. By setting different parameters and initial conditions, the circuit exhibits rich dynamic behaviors during intermittent periods, such as transient chaos or periodicity. Furthermore, reference [13] established a simple memristive chaotic circuit in which an inductor and a capacitor are connected in series, and one of the capacitors cascades a memristor. Reference [14] constructed a memristive chaotic circuit composed of four elements: a memristor, an inductor, a capacitor, and a linear negative resistor. It can be seen that the memristor plays an irreplaceable role in the chaotic circuit, so detecting memristor faults is also an extremely important task. In this paper, based on the characteristics of memristors in chaotic circuits, ANNs are used to detect memristor faults.
The paper implements machine learning techniques in the form of neural networks for the detection of memristor faults. The unique contributions of the paper include the following:
(i) An exhaustive literature review of various forms and implementations of memristor models
(ii) Implementation of a one-dimensional CNN model for fault detection of memristors, followed by an enhanced AI-based model proposed on the basis of the issues identified
(iii) Evaluation of the proposed enhanced model against the CNN model to justify its superiority
The rest of the paper is organized as follows. Section 2 presents a detailed review of studies relevant to memristor models. Section 3 presents the methodology, Section 4 discusses the experimental analysis and results, and finally, the conclusion is discussed in Section 5.

Related Work
As a new nonlinear basic circuit component, the memristor has been explored and studied in the field of chaotic circuits. One chaotic circuit was built around a smooth memristor model and consists of a capacitor, a negative inductance, a negative resistance, and a negative capacitor. Its dynamic characteristics are analyzed, and the simulation results and experimental results show good consistency, but the realization of the circuit uses a microcontroller module, an AD/DA conversion module, a digital signal isolation module, and related programming control, so the circuit implementation is rather complicated [15]. In the same year, reference [16] proposed a fifth-order memristive chaotic system based on Chua's chaotic oscillator circuit by adding two smooth memristor models to the circuit. The system can generate memristive chaotic attractors and has infinitely many equilibrium points, but no hardware circuit was realized for it. Reference [17] proposed a memristive chaotic oscillatory system and carried out detailed research and numerical simulation analysis of its basic dynamic characteristics but lacked a hardware circuit implementation. In 2012, reference [18] connected two HP memristor models in reverse, used them to replace the Chua's diode in Chua's circuit, proposed a fifth-order chaotic system, and simulated and analyzed its basic dynamic characteristics but did not implement a hardware circuit for the system. In the same year, Pham et al. proposed a new memristive chaotic circuit by combining the smooth memristor model with a time delay circuit. The circuit can generate a single-scroll chaotic attractor, but it uses many circuit components and its structure is complex [19]. Reference [20] proposed a new third-order chaotic system based on the HP memristor model and carried out theoretical analysis and SPICE simulation analysis of the basic dynamic characteristics of the system.
Reference [21] proposed a series of memristive chaotic systems based on Chua's oscillation circuit by connecting two HP memristor models in reverse and using them to replace circuit components in Chua's oscillation circuit. The basic dynamic characteristics of each system are analyzed by simulation, but the hardware circuit implementation is lacking. Reference [22] proposed a four-dimensional memristive chaotic system based on the piecewise linear memristor model. Simulation analysis in MATLAB shows that the system can generate hyperchaotic attractors, but the hardware circuit of the system was not realized. Reference [23] put a smooth memristor model in parallel with a negative resistance, used the combination to replace the nonlinear resistance in Chua's circuit, and proposed a new four-dimensional memristive chaotic circuit; the basic dynamic characteristics of the circuit are simulated and verified by hardware circuit experiments. Reference [24] proposed a memristive chaotic system based on a four-piece linear memristor model. Numerical simulation analysis shows that the memristive chaotic system can generate complex dynamic behaviors, but the system lacks a hardware circuit implementation. Reference [25] proposed a chaotic oscillator based on the HP memristor model. The basic dynamic characteristics of the oscillator are analyzed theoretically, and ADS simulation analysis is carried out using the discrete wavelet transform, but no hardware circuit implementation of the proposed chaotic oscillator is given. Reference [26] proposed a memristor chaotic system that can generate two-scroll and four-scroll chaotic attractors. The circuit topology of the system uses a memristor model with a quartic polynomial characteristic, but the system lacks a hardware circuit implementation.
Reference [27] studied the application of the piecewise linear memristor model and the smooth memristor model in fractional-order chaotic systems and proposed a fractional-order memristive chaotic system based on the Lorenz system. The basic dynamic characteristics of the system are analyzed by numerical simulation, but the system lacks a hardware circuit realization.
Troubleshooting is a very important issue in circuit design and testing. Fault diagnosis usually involves testing for circuit faults, identifying faulty components, and verifying their parameters [28]. Special attention has recently been paid to the diagnosis of soft faults, which generally occur when a parameter of a device deviates slightly from its tolerance range. For analog circuits and systems, soft fault diagnosis is often very difficult and complex because of tolerance effects and device nonlinearity. In order to overcome these problems, many diagnostic methods have been proposed. Among neural network-based methods, reference [29] proposed a method of applying neural networks to the fault diagnosis of practical circuits. Reference [30] used a wavelet neural network as a preprocessor and proposed a modular diagnostic system to diagnose analog circuits. Reference [31] proposed a new soft fault diagnosis method for analog circuits based on slope fault features and the BP neural network. Reference [32] uses the node voltage sensitivity sequence to diagnose faults in circuits. References [33,34] use fuzzy theory to propose a method for diagnosing circuit faults under tolerance conditions. It can be seen that there are many methods for the diagnosis of analog circuit faults. The study in [35] implemented two different approaches to incorporate memristor simulation models. In the first approach, Strukov's model was used, and in the second, an equivalent Wiener model approximation was implemented. The dynamic properties of the models were compared, and their impact on memory and nonlinear processing capability was analyzed. The studies in [36][37][38] proposed a programmable ultraefficient memristor-based accelerator (PUMA), which enhances memristor crossbars with general purpose execution units. The objective was to accelerate various forms of machine learning inference workloads.
However, there are currently few studies related to fault diagnosis of multiple memristor circuits, although the effectiveness of nanoscale composite circuits has been demonstrated in many fields, especially for the realization of artificial synapses. In order to improve the accuracy and effectiveness of memristors in various applications, methods for fault diagnosis of memristor circuits are highly desirable.

Method
3.1. Convolutional Neural Network Structure. There are five major schools of machine learning: symbolicism, Bayesianism, connectionism, evolutionism, and behavioral analogy. The neural network method used in this paper belongs to the connectionism school. In recent years, with the improvement of computer performance and the popularity of parallel computing methods, neural network algorithms have come to the fore. The basic structure is shown in Figure 1. CNNs have become the mainstream research direction of computer vision in the past decade. In particular, in September 2015, the "deep residual network" developed by Microsoft Research Asia achieved amazing results and won the championship in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), which covers image classification, positioning, and detection. The ILSVRC evaluates algorithms for object detection and image classification at a large scale and motivates researchers to compare progress in the detection of various types of objects. The error rate was reduced to about 5%, which is lower than the human eye recognition error rate. The powerful learning ability of CNN has thus been verified; its robust fitting and feature learning ability offers excellent answers and ideas for the bottleneck issues of pattern recognition, and it has been effectively applied to a range of pattern categorization challenges.
The invention of the CNN benefited from research on the visual neural structure of cats; CNNs were first used in computer vision research, where they achieved good results. The name originates from the convolutional layer, which distinguishes the CNN from other neural networks: a local convolution kernel scans the entire receptive area and performs a convolution operation during the calculation. Convolutional layers greatly reduce the number of neural network parameters and the computational complexity of the learning process, because multiple high-level neurons apply the same weights to the bottom-level outputs and differ only in the position of the convolution kernel. If full connection were adopted, each high-level neuron would need its own independently calculated weight matrix; since the convolutional structure shares one weight matrix, it is also called the weight sharing layer. Owing to weight sharing and other characteristics, the parameter calculation of the neural network is greatly reduced, the risk of overfitting is reduced, and the performance of large deep neural networks is greatly improved. The main structure of a CNN can be divided into the convolutional layer, pooling layer, and fully connected layer, among which the convolutional and pooling layers often appear alternately. The input of the convolutional network in this paper is one-dimensional data. After layer-by-layer calculation, the output indicates whether the memristor is faulty [39].
3.1.1. Convolutional Layers. The convolutional layer is usually used for initial feature extraction: multiple convolution kernels are convolved with the input data, and the resulting high-dimensional data is passed through an activation function, which introduces nonlinearity, to obtain a series of feature maps. The mathematical formula for the convolution process is

$$E_j^k = f\left(\sum_i E_i^{k-1} * w_{ij}^k + p_j^k\right) \tag{1}$$

where $E_j^k$ represents the $j$th element of the $k$th layer, $w_{ij}^k$ represents the convolution kernel of the layer, $p_j^k$ is the bias matrix of the layer, and $*$ denotes convolution. $f$ is the activation function; passing the matrix data obtained above through the activation function enables the convolutional layer to extract nonlinear features. In this paper, the activation function adopts the ReLU function, and the mathematical expression is

$$f(x) = \max(0, x) \tag{2}$$

The activation function can simplify the calculation process and effectively avoid the gradient dispersion or explosion phenomenon in the learning process [40].
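As an illustration of the convolution and ReLU steps described above, the following is a minimal pure-Python sketch for a single one-dimensional channel. The function names `conv1d` and `relu`, the example signal, and the kernel values are assumptions made for illustration; the sliding dot product is written without flipping the kernel, as is common in CNN libraries, and this is not the paper's implementation.

```python
def relu(x):
    """ReLU activation: f(x) = max(0, x)."""
    return x if x > 0.0 else 0.0


def conv1d(signal, kernel, bias=0.0):
    """Slide one kernel over a 1D signal (no padding, stride 1) and apply ReLU.

    Every output position reuses the same kernel weights, which is the
    weight sharing property of the convolutional layer.
    """
    k = len(kernel)
    feature_map = []
    for start in range(len(signal) - k + 1):
        window = signal[start:start + k]
        pre_activation = sum(w * s for w, s in zip(kernel, window)) + bias
        feature_map.append(relu(pre_activation))
    return feature_map


# Hypothetical 5-sample signal and a length-3 difference kernel.
feature_map = conv1d([1.0, 2.0, -1.0, 0.5, 3.0], kernel=[1.0, 0.0, -1.0])
print(feature_map)
```

Negative pre-activations are clipped to zero by ReLU, so only windows that match the kernel's pattern contribute to the feature map.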

3.1.2. Pooling Layer. The pooling layer is commonly employed to minimize the size of the extracted feature map, simplify the computation, and ensure that the collected features are translation invariant. Mean pooling, median pooling, and maximum pooling are the most often utilized pooling layers. The maximum pooling approach is used in this paper's neural network. For each area, the largest value is used to build the next layer of feature maps, which are then pooled together to produce the final feature map [40].
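Maximum pooling as described above can be sketched in a few lines. The window size and stride of 2 are assumed values for illustration only; the paper does not state the pooling parameters it uses.

```python
def max_pool1d(feature_map, pool_size=2, stride=2):
    """Keep the largest value of each window of a 1D feature map."""
    pooled = []
    for start in range(0, len(feature_map) - pool_size + 1, stride):
        pooled.append(max(feature_map[start:start + pool_size]))
    return pooled


# Hypothetical feature map of length 6, halved by pooling.
pooled = max_pool1d([0.2, 1.5, 0.0, 0.7, 3.1, 2.4])
print(pooled)
```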

3.1.3. Fully Connected Layer. The feature maps from the convolutional and pooling layers are recombined in the fully connected layer to provide diagnostic findings. Each neuron in this layer is connected to all feature values of the prior layer. The processing can be expressed by the following formula:

$$y_l = f\left(\sum_i w_{il} x_i + b_l\right) \tag{3}$$

where $y_l$ represents the $l$th value of the output of the layer, $x_i$ is the $i$th feature value of the input, and $w_{il}$ and $b_l$ are the corresponding weight and bias [40].

3.1.4. Output Layer. The output layer in this paper uses a Softmax layer. In multiclassification tasks, the Softmax layer is often used as the output layer of the neural network; each of its outputs is greater than or equal to 0, and all of its outputs sum to 1. It converts the judgment of the fully connected layer about the category of the sample into a distribution that can be interpreted as a probability. The specific formula is as follows:

$$q_l = \frac{e^{y_l}}{\sum_c e^{y_c}} \tag{4}$$

where $y_l$ is the $l$th output of the fully connected layer and $q_l$ is the probability assigned to class $l$. The data label corresponding to the Softmax layer is a one-hot vector, and the output in the text is two types of faults [40].
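The fully connected and Softmax steps can be sketched together as follows. This is a minimal illustration, not the paper's code: the names `dense` and `softmax`, the weights, and the two-class example are assumptions, and the maximum score is subtracted before exponentiation, a standard trick for numerical stability that leaves the result unchanged.

```python
import math


def dense(features, weights, biases):
    """Linear part of a fully connected layer: one weighted sum per output unit."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Map raw scores to nonnegative values that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical two-feature input and two output classes.
scores = dense([1.0, 2.0], weights=[[0.5, 0.5], [1.0, -1.0]], biases=[0.0, 0.0])
probs = softmax(scores)
predicted_class = probs.index(max(probs))
print(scores, probs, predicted_class)
```

The predicted class is the index of the largest probability, which is compared against the position of the 1 in the one-hot label.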
3.1.5. Batch Normalization. Batch normalization (BN) was proposed by Ioffe in 2015. This technique is mainly used in DNNs to improve the convergence speed of the model. Since a neural network relies on gradient transfer for training, gradient explosion or gradient disappearance is prone to occur as the network structure becomes deeper, so that the model fails to converge or converges slowly. BN normalizes the output of each nonoutput layer of the neural network to prevent it from being too large or too small, keeping its distribution within a range that is conducive to gradient transfer [41].
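The normalization step can be illustrated with a short sketch for one feature across a batch. The scale `gamma`, shift `beta`, and small constant `eps` follow the standard formulation of batch normalization and are assumptions here, since the paper does not list its BN settings.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a batch to roughly zero mean and unit variance."""
    mean = sum(batch) / len(batch)
    variance = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(variance + eps) + beta for x in batch]


# Hypothetical activations of one neuron for a batch of four samples.
normalized = batch_norm([10.0, 12.0, 14.0, 16.0])
print(normalized)
```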

3.2. Training of Convolutional Neural Networks.
A neural network cannot complete the required task as soon as it is built; it needs to be trained with labeled data so that it learns the mapping from data to labels. The training of the model depends on certain evaluation criteria, and training stops when these criteria are met. This section introduces the model training techniques. The backward learning process of the neural network is the process of parameter updating, whose aim is to bring the output of the network to the expected value. Usually, the cost function is used to evaluate the learning degree of the neural network. Its value is the sum of the loss function value and the regular term value. The loss function measures the difference between the actual output and the expected output. The loss function used here is the cross entropy common in multiclassification problems [41]. The formula is

$$L(x, y) = -\sum_i y_i \log x_i \tag{5}$$

wherein $x$ is the softmax vector actually output by the network and $y$ is the one-hot vector of the label. The regular term measures the "deviation degree" of the parameter distribution in the neural network. When the deviation between the network parameters of a certain layer is larger, the output value of that layer is more dependent on the output value of the previous layer, which also makes the entire neural network easier to overfit. In order to reduce overfitting, the neural network introduces a regular term into the cost function [41]. This paper adopts the $L_2$ regularization method:

$$R(M) = \sum_{N \in M} \|N\|_F^2 \tag{6}$$

where $M$ is the set of parameter matrices for each layer of the neural network and $\|N\|_F$ is the Frobenius norm of the matrix $N$. The value of the cost function is then

$$C = L(x, y) + \theta R(M) \tag{7}$$

where $\theta$ is the coefficient of the regular term, which is used to control the effect of the regular term, and is set to 0.6 in this paper.
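To make the cost function concrete, the sketch below computes the cross-entropy loss for one sample, the squared Frobenius norm penalty over a set of weight matrices, and their weighted sum. The function names, the toy weight matrix, and the clipping constant `eps` are assumptions for illustration; only the value 0.6 for the coefficient is taken from the text.

```python
import math


def cross_entropy(predicted, one_hot, eps=1e-12):
    """Cross entropy between a softmax output and a one-hot label."""
    return -sum(y * math.log(max(x, eps)) for x, y in zip(predicted, one_hot))


def l2_penalty(parameter_matrices):
    """Sum of squared Frobenius norms over all parameter matrices."""
    return sum(w * w
               for matrix in parameter_matrices
               for row in matrix
               for w in row)


def cost(predicted, one_hot, parameter_matrices, theta):
    """Loss value plus theta times the regular term."""
    return cross_entropy(predicted, one_hot) + theta * l2_penalty(parameter_matrices)


# Hypothetical network output, label, and a single 2x2 weight matrix.
weights = [[[1.0, -2.0], [0.5, 0.0]]]
loss = cross_entropy([0.9, 0.1], [1, 0])
penalty = l2_penalty(weights)
total = cost([0.9, 0.1], [1, 0], weights, theta=0.6)
print(loss, penalty, total)
```

With a one-hot label only the log-probability of the true class contributes to the loss, so a confident correct prediction gives a loss close to zero while large weights still raise the total cost.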

3.3. Improved CNN Model.
In this section, some improvements are proposed for the CNN method, that is, it is proposed to extract some other features of the memristor in the chaotic circuit by FFT, wavelet, and other methods and screen out the features with higher correlation with the fault to construct an artificial feature domain. It is used as the high-level input of the convolutional neural network and is trained by step-by-step training to obtain an artificial feature enhancement model. Since the feature extraction process of the CNN comes from the training set, when the feature distribution of the test set and the training set is quite different, such as the large difference between the current and voltage, the generalization ability is usually not good. The reason is that the features on which the CNN decision-making depends are not necessarily the fault features themselves but may be some accidental factors caused by the fault features under the experimental conditions that generate the training set data. It helps CNN make decisions very well but lacks good generalization. Based on this, this section proposes to use some artificial features as the high-level input of CNN. Artificial features are not automatically extracted by CNN, so the dependence of CNN on training set features can be reduced to a certain extent. These artificial features come from the traditional fault diagnosis theory. After extraction, experts can often diagnose faults based on them. They are also often used in discriminant models such as SVMs. Although artificial features cannot be directly used for fault diagnosis due to the existence of modulation and other phenomena, they tend to generalize better than features extracted by neural networks. Although the theory of CNN has been developed by leaps and bounds, it is still limited by the number of samples, and it is easy to produce overfitting on small batch data. 
However, other machine learning models trained with artificial features often fail to achieve sufficient accuracy because of insufficient fitting ability and too few features. One idea is to incorporate the artificial features directly into the memristor voltage and current signal data and then let the CNN model extract new features. However, under the influence of the regularization term, the neural network tends to depend less on any single part of the features. If the artificial features are incorporated directly, they may therefore have little effect, and the purpose of introducing empirical features into the network is not achieved. If the regularization term is removed instead, the neural network is prone to overfitting, which runs counter to the purpose of enhancing the generalization ability. Inspired by the current trend of transfer learning, this section introduces the filtered, artificially expanded feature domain into the CNN trained in the previous section. While the powerful feature extraction capability of the CNN model is retained, artificial features with better generalization performance are introduced into the neural network, which then performs secondary learning to obtain a model with stronger generalization ability.
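A rough sketch of the idea follows: spectral magnitudes are extracted from a signal as artificial features and concatenated with the feature vector produced by the already trained CNN, so that a later (higher) layer sees both. Everything here is an assumption for illustration: the plain discrete Fourier transform stands in for the FFT, the names `dft_magnitudes` and `enhanced_input` are hypothetical, the CNN feature values are made up, and the paper does not specify which features are screened or at which layer they enter.

```python
import cmath
import math


def dft_magnitudes(signal, n_bins):
    """Normalized magnitudes of the first n_bins discrete Fourier coefficients."""
    n = len(signal)
    magnitudes = []
    for k in range(n_bins):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        magnitudes.append(abs(coeff) / n)
    return magnitudes


def enhanced_input(cnn_features, artificial_features):
    """Concatenate learned CNN features with hand-crafted features."""
    return list(cnn_features) + list(artificial_features)


# Hypothetical memristor voltage: two full sine cycles over a 16-sample window.
n = 16
voltage = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
spectrum = dft_magnitudes(voltage, n_bins=4)

# Made-up output of the pretrained convolutional part for the same sample.
fused = enhanced_input([0.3, 0.8], spectrum)
print(spectrum, len(fused))
```

In a step-by-step training scheme of this kind, the convolutional part would typically be kept from the first training stage and only the layers that consume the fused vector would be trained in the second stage; whether the paper freezes the earlier layers is not stated.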

3.4. Selection of Detection Indicators.
Based on fuzzy theoretical analysis, if the measured voltage $U$ of node $A$ of the memristor test circuit is close, within a certain tolerance range, to the nominal voltage $U_A$ in the state of fault $F$, the current state of the test circuit is similar to the state of fault $F$. The closer $U$ is to $U_A$, the closer the similarity degree is to 1; otherwise, the similarity rapidly decreases to 0. Each fault in the fault set $F$ can be expressed in fuzzy mathematical form by a comprehensive function, so fault diagnosis determines the current test circuit state from the measured value $u_r$ ($u_r \in U_1 \times U_2 \times \cdots \times U_m$) of the test nodes. Based on the principle of maximum membership, if $F_i$ satisfies

$$\mu_{F_i}(u_r) = \max\{\mu_{F_1}(u_r), \mu_{F_2}(u_r), \ldots, \mu_{F_n}(u_r)\} \tag{8}$$

where $n$ is the number of faults in the set, then $u_r$ can be considered to belong to $F_i$, and the current state of the test circuit is most similar to the state of fault $F_i$.
For fault diagnosis, the key is to determine the membership function. The most suitable membership functions are those that best represent the fault, are easy to obtain, and consume little computational time. The Gaussian function is selected as the membership function:

$$\mu_f(u) = \exp\left(-\frac{(u - r)^2}{2\sigma^2}\right) \tag{9}$$

where $\sigma$ controls the width of the function. If $u = r$, then $\mu_f(u) = 1$, so according to the diagnosis hypothesis, the nominal value under a certain fault is the parameter $r$.
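The maximum membership decision with a Gaussian membership function can be sketched as follows. The exact Gaussian form, the width `sigma`, the fault names, the nominal voltages, and the use of the minimum over nodes to combine per-node memberships are all assumptions for illustration; the paper does not give these details.

```python
import math


def gaussian_membership(u, nominal, sigma):
    """Membership degree of measurement u; equals 1 when u matches the nominal value."""
    return math.exp(-((u - nominal) ** 2) / (2.0 * sigma ** 2))


def diagnose(measured, fault_nominals, sigma):
    """Return the fault with the largest membership degree and all degrees.

    measured: node voltages of the circuit under test.
    fault_nominals: mapping from fault name to nominal node voltages.
    """
    degrees = {}
    for fault, nominals in fault_nominals.items():
        # Assumed combination rule: a fault matches only as well as its worst node.
        degrees[fault] = min(gaussian_membership(u, r, sigma)
                             for u, r in zip(measured, nominals))
    best = max(degrees, key=degrees.get)
    return best, degrees


# Hypothetical nominal voltages at two test nodes for two fault states.
fault_nominals = {"F0": [1.0, 2.0], "F1": [1.4, 2.6]}
best, degrees = diagnose([1.35, 2.5], fault_nominals, sigma=0.1)
print(best, degrees)
```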

Wireless Communications and Mobile Computing
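According to the invariance of the node voltage sensitivity ratio, assume that the resistance value of the memristor in the memristor composite test circuit is $M$ and that the test node set is $A = \{A_0, A_1, \ldots, A_m\}$. For any two test nodes $A_i$ and $A_j$ ($i, j \in [1, m]$), the relationships between the node voltages $V_i$, $V_j$ and the parameter $M$ are

$$V_i = \frac{\mu_i M + \nu_i}{\lambda M + \delta} \tag{10}$$

$$V_j = \frac{\mu_j M + \nu_j}{\lambda M + \delta} \tag{11}$$

The parameters $\mu_i, \nu_i, \mu_j, \nu_j, \lambda, \delta$ have nothing to do with $M$ and are determined only by the input excitation and topology of the circuit. Therefore, the node voltage sensitivities $S_M^{V_i}$ and $S_M^{V_j}$ are, respectively,

$$S_M^{V_i} = \frac{\partial V_i}{\partial M} = \frac{\mu_i \delta - \nu_i \lambda}{(\lambda M + \delta)^2} \tag{12}$$

$$S_M^{V_j} = \frac{\partial V_j}{\partial M} = \frac{\mu_j \delta - \nu_j \lambda}{(\lambda M + \delta)^2} \tag{13}$$

According to formulas (12) and (13), the node voltage sensitivity ratio is

$$\frac{S_M^{V_i}}{S_M^{V_j}} = \frac{\mu_i \delta - \nu_i \lambda}{\mu_j \delta - \nu_j \lambda} \tag{14}$$

This ratio is independent of the memristor parameter $M$; it remains constant when the memristor fails and can be used as a feature for diagnosing the fault state. Therefore, the evaluation indicators selected in this paper are six parameters and two node voltages. The specific fault detection index system is shown in Table 1.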
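The invariance of the sensitivity ratio can be checked numerically. The sketch assumes the usual bilinear dependence of a node voltage on a single element value, $V = (\mu M + \nu)/(\lambda M + \delta)$, and uses made-up coefficient values; neither the coefficients nor the function names come from the paper.

```python
def node_voltage(M, mu, nu, lam, delta):
    """Node voltage as a bilinear function of the memristance M (assumed form)."""
    return (mu * M + nu) / (lam * M + delta)


def sensitivity(M, mu, nu, lam, delta):
    """Analytic derivative dV/dM of the bilinear form."""
    return (mu * delta - nu * lam) / (lam * M + delta) ** 2


def sensitivity_ratio(M, node_i, node_j, lam, delta):
    """Ratio of the sensitivities of two nodes that share the same denominator."""
    return sensitivity(M, *node_i, lam, delta) / sensitivity(M, *node_j, lam, delta)


# Hypothetical (mu, nu) pairs for two test nodes and shared lambda, delta.
node_i, node_j = (2.0, 1.0), (0.5, 3.0)
lam, delta = 1.0, 4.0

# The ratio is evaluated at very different memristance values.
ratios = [sensitivity_ratio(M, node_i, node_j, lam, delta)
          for M in (10.0, 100.0, 1000.0)]
print(ratios)
```

Because both sensitivities share the factor $1/(\lambda M + \delta)^2$, it cancels in the ratio, which is why the printed values agree for every $M$.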

CNN Model and Parameters.
To evaluate the CNN model's detection accuracy, this article created a data set of memristor parameter characteristics and node voltages based on chaotic circuits, with a total of 1000 sets of data. In this study, the training set includes four times as many samples as the test set. CNN parameter determination is currently based on manual experience rather than a clear theory, so the parameters must be repeatedly adjusted and compared to find an optimal value. There are five convolutional layers, six pooling layers, and three fully connected layers in this section's model. The output dimensions of the fully connected layers are 80, 40, and 4 in turn. If all samples are included in every iteration during the learning process of the neural network, the computational burden increases greatly and the convergence speed is reduced. Therefore, this section adopts the method of batch learning, and the size of each batch is 100 samples. For the first two fully connected layers, dropout with a parameter of 0.2 is performed. The first convolutional layer in this paper adopts a large convolution kernel with a length of 101. This is because the data sampling frequency used in this article is high: if the first-layer convolution kernel is small, its receptive field contains little information, and it is difficult to learn more information. The following is the comparison and selection process for some main parameters of the model:
(1) Fully connected layer hyperparameters: after the convolutional layer and pooling layer extract features from the data, the fully connected layer is usually used to combine these features and finally give the output value.
If there are too few neurons in the fully connected layer or the depth is too shallow, the model easily fails to learn enough features and has difficulty fitting the data. If there are too many neurons or too many layers, the model easily overfits and its generalization ability is reduced. Although the number of fully connected layers and the number of neurons in each layer cannot be tested exhaustively, it is still possible to select suitable hyperparameters through various attempts. In this paper, several hyperparameter combinations are selected for comparison, and the experimental results obtained are shown in Figure 2. The comparison of the test set accuracy of the model under different hyperparameters shows that it is reasonable to set the fully connected part to three layers. Among the combinations, the test set has the highest diagnostic accuracy under the hyperparameter combination of 80-20-4, which means that the output dimensions of the fully connected layers are 80, 20, and 4 in turn.
(2) Minibatch parameter: the application of minibatch technology can improve the convergence speed of the model. The batch size parameter is the number of samples used in one training step. When the batch size is too large, there is little difference from not using minibatch technology; when it is too small, the model has difficulty converging, resulting in insufficient fitting accuracy. This section compares seven cases, namely batch sizes of 16, 32, 64, 128, 192, and 256 and training without minibatch technology, and sets a fixed number of 50 epochs to compare the convergence process. The comparison results are shown in Table 2. According to the convergence speed of the model, the final batch size value is 128.
(3) Results of the verification: this study builds and trains the neural network using Keras and Python 3, and the data set is split into training and test sets at a ratio of 4 : 1. The dropout coefficient is set to 0.3, and the L2 regularization coefficient is set to 0.01 during training. Training runs for 40 rounds. Figure 3 depicts the diagnostic accuracy of the training and test sets throughout the training procedure. The training set accuracy finally reached 0.95, and the test set accuracy also converged above 0.9. It can be seen that the model has high accuracy for the diagnosis of memristor faults, but there is obviously room for improvement. Therefore, the artificial feature enhancement model is evaluated next. During its training, the training set accuracy reached 0.98 after the 12th round, and the test set accuracy stabilized above 0.95 after the 12th round. It can be seen that, compared with the ordinary one-dimensional CNN, the artificial enhancement model not only has a higher accuracy on the training set but also has a much higher accuracy on the test set. This indicates that the artificially enhanced model has better generalization performance than the CNN model.
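The data handling described in this section, a 4 : 1 split of 1000 samples followed by minibatches of 128, can be sketched as below. The helper names, the fixed random seed, and the use of integer indices in place of real samples are assumptions made for illustration.

```python
import random


def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle and split samples into training and test sets (4 : 1 by default)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def minibatches(samples, batch_size):
    """Yield consecutive batches; the last one may be smaller."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


# Indices stand in for the 1000 data sets mentioned in the text.
train, test = split_dataset(range(1000))
batches = list(minibatches(train, batch_size=128))
print(len(train), len(test), len(batches), len(batches[-1]))
```

With 800 training samples and a batch size of 128, one epoch consists of six full batches and one final batch of 32 samples.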

Conclusion
The rapid development of modern electronics has brought great changes to our daily life, but the ongoing development trend now faces enormous challenges, especially as more and more computing tasks involve processing big data, such as image processing. The emergence of a new type of device, the memristor, has brought a major breakthrough to modern electronic technology owing to its many excellent characteristics. Memristors have been widely used in high-density nonvolatile memory, ultrahigh-density Boolean logic, signal processing, and nonlinear circuits. Therefore, researchers are increasingly interested in memristors and their related applications. Fault diagnosis is very important in circuit design and testing. In order to improve the accuracy and effectiveness of memristors in various applications, a method combining a neural network and fuzzy analysis is used to diagnose the faults of memristors in chaotic circuits; the diagnosis can accurately determine whether the memristive device in the circuit is faulty. The work of this paper is as follows: (1) The development status of memristors and the detection methods related to memristor faults are introduced in detail. This paper introduces and summarizes the development of machine learning, especially neural networks, and the use of related theories in memristor fault detection, and determines the application methods and research directions of the paper. (2) The related theory of CNN is introduced, and the training techniques of the CNN used in this article are explained. A one-dimensional CNN is proposed for fault diagnosis, some of its shortcomings are analyzed, and an improved method, namely the artificial feature enhancement model, is proposed. (3) A memristor fault detection data set is designed, and the validity of the structure and parameters of the selected CNN is verified.
The model was tested for various hyperparameter combinations, and the 80-20-4 combination yielded the highest accuracy. Finally, through the comparison of the CNN model and the artificial feature enhancement model, the effectiveness of the improvement and the superiority of the improved model in fault diagnosis are verified. Although the results seem quite promising, the model was not evaluated against other predominant algorithms. As part of future studies, the model could be further compared against other state-of-the-art models to justify its superiority.

Data Availability
The data sets used during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest
The author declares that he has no conflict of interest.