A Hybrid Convolutional Neural Network and Support Vector Machine-Based Credit Card Fraud Detection Model

Credit card fraud is a common occurrence in today's society because most of us now use credit cards as a form of payment more frequently. This is the outcome of developments in technology and an increase in online transactions, which have given rise to frauds that have caused significant financial losses. Efficient and effective approaches are therefore needed to detect fraud in credit card transactions. In this study, we developed a hybrid CNN-SVM model for detecting fraud in credit card transactions. The effectiveness of the proposed model was tested using real-world public credit card transaction data. The architecture of our hybrid CNN-SVM model was developed by replacing the final output layer of the CNN model with an SVM classifier. The first classifier is a fully connected layer with softmax that is trained using an end-to-end approach, whereas the second classifier is a support vector machine that is stacked on top after deleting the final fully connected and softmax layer. According to experimental results, our hybrid CNN-SVM model produced classification performances with accuracy, precision, recall, F1-score, and AUC of 91.08%, 90.50%, 90.34%, 90.41%, and 91.05%, respectively.


Introduction
A credit card is a small, thin plastic or fiber card bearing personal information such as a picture or signature that allows the person whose name is on it to charge goods and services to the connected account, with regular debits being made from that account [1]. It is a financial product provided by banks with a predetermined credit limit that enables the holder to make purchases without using cash. Card issuers use the holder's credit score, credit history, and income to establish the credit limit. Customers who borrow money using credit cards are required to pay it back in full by the billing date or over time, together with any relevant interest and other costs.
In recent years, as technology has advanced, most people have come to purchase what they need using a credit card [2,3]. Services such as e-commerce, tap-and-pay systems, and online bill payment systems have proliferated and become more widely used [2,4].
Despite the many advantages that credit cards offer users, they are also linked to issues such as fraud and security [5]. For any bank in the world, the security of card payments and the confidence of customers using cards to make purchases are serious concerns [6]. According to a number of reports, attempts to detect credit card fraud have increased over time [7,8]. This is why credit card fraud is reportedly a problem for banks and other financial institutions.
Banks and other financial organizations use machine learning mostly to improve their ability to detect fraudulent transactions. However, there are several reasons why machine learning may have trouble detecting fraud [9][10][11]: the distribution of the data is highly imbalanced because fraudulent transactions are rare, the data change frequently over time, and real-world datasets are scarce because of privacy issues.
Numerous approaches have been proposed in the literature in an effort to overcome these issues, but, as far as the authors have been able to determine from the accessible credit card fraud detection literature, no hybrid convolutional neural network-support vector machine has been reported.
This study integrates deep learning with machine learning algorithms to detect credit card fraud, building on the body of existing literature. We propose a CNN hybridized with the support vector machine for three reasons: the support vector machine is effective on datasets with many features, including cases where the number of features is greater than the number of data points; it uses only a subset of training points in the decision function, called support vectors, which makes it memory efficient; and different kernel functions can be specified for the decision function.

Related Works
Considerable research effort has been devoted in the literature to credit card fraud detection because of the potential economic value of recognizing and classifying fraud. This section reviews a number of significant studies.
Fraudulent credit card transactions can cause significant financial losses, especially when they have a high value. In order to prevent fraudulent transactions from being approved by card issuers, it is essential to spot them. The majority of conventional techniques for detecting fraud are built on machine learning models. Numerous studies have looked into how deep learning models could be used to accurately spot fraudulent transactions, yet these investigations take into account only one deep learning model. Reference [12] presents a variety of deep learning and ensemble algorithms for identifying fraudulent credit card transactions. The primary aim of that work is to reduce missed frauds and false alarms, and its main contribution is to merge the results of three different deep learning models, namely, convolutional neural networks, autoencoders, and recurrent neural networks. Both the owners of credit cards and financial institutions suffer large financial losses as a result of credit card theft. Reference [13] developed a credit fraud detection model using state-of-the-art machine learning and deep learning algorithms. Their primary goal is to address the challenges of fraud detection, which include the availability of public data, data with large class imbalances, changes in the form of fraud, and a high rate of false alarms.
They considered several machine learning-based strategies for credit cards, including the extreme learning method, decision tree, random forest, support vector machine, logistic regression, XGBoost, and modern deep learning algorithms. However, due to the low accuracy, they apply state-of-the-art deep learning algorithms to reduce fraud losses. The recent advancement of deep learning algorithms has been the main area of their focus. The dataset was first subjected to a machine learning technique, which somewhat increased the accuracy of fraud detection. Subsequently, three convolutional neural network-based designs are used to boost the effectiveness of fraud detection. The model they suggest comprises 14 layers, starting with a convolutional layer with a kernel size of 32 × 2 and a ReLU activation function, then a batch normalization layer, and finally a dropout layer with a dropout rate of 0.2. Then, a batch normalization layer, a dropout layer, and a convolutional layer with a kernel size of 64 × 2 and a ReLU activation function are added. After that, they add a flattening layer with a kernel size of 64 × 2 and a ReLU activation function and then three dense layers, a dense layer, and a dropout layer with a dropout rate of 0.5. By adding more layers, they improved the precision of detection, and a thorough empirical investigation was conducted by varying the number of hidden layers and epochs and by applying the newest models. The evaluation of their research effort demonstrates the enhanced outcomes obtained, with optimum values for accuracy, F1-score, precision, and AUC of 99.9%, 85.71%, 93%, and 98%, respectively. For problems involving credit card fraud detection, their suggested model performs better than the state-of-the-art machine learning and deep learning algorithms.
Taha and Malebary [14] provide an intelligent method for identifying fraud in credit card transactions utilizing an optimized light gradient boosting machine (OLightGBM). The suggested method integrates a Bayesian-based hyperparameter optimization algorithm to tune the parameters of a light gradient boosting machine (LightGBM). The 5-fold cross-validation method is utilized to test the effectiveness of the proposed method for identifying credit card fraud using two real-world data sets. Several metrics are taken into account, including precision, recall, accuracy, AUC, and F1-score, to evaluate the performance of the suggested technique. According to their experimental results, the suggested approach performed better than other methods on the two data sets in terms of accuracy (98.40%), the area under the receiver operating characteristic curve (AUC) (92.88%), precision (97.34%), and F1-score (56.95%).
Some other authors trained four prediction models, including the random forest (RF), the gradient boosting machine, and the artificial neural network (ANN) [15]. The class imbalance issue in the data was a significant difficulty for them when they developed their fraud detection systems because actual transactions exceed fraudulent ones, which frequently make up less than 1% of all transactions. In order to overcome this challenge, they used the synthetic minority oversampling technique (SMOTE), random under-sampling The precision score for this classifier was observed to be 0.86. The stacked ensemble algorithm, which had the best average performance at 0.78, was trained using every sampled dataset. In the end, they come to the conclusion that the stacked ensemble model has shown promise in the detection of fraudulent transactions across the majority of sampling techniques. Special attention has also been given to developing a model for detecting credit card fraud using hybrid AdaBoost and majority voting techniques [16]. The empirical evaluation made use of a number of common models, including NB, SVM, and DL. An openly accessible credit card data set was utilized to evaluate both individual (standard) models and hybrid models that combined the AdaBoost and majority voting approaches. Because it considers true and false positive and negative outcomes alike, the MCC metric was used as a performance indicator. The best MCC score, obtained through majority voting, is 0.823. They have employed the same individual and hybrid models. AdaBoost and majority voting techniques have been used to attain a flawless MCC score of 1. Noise ranging from 10% to 30% has been introduced to the data samples to help further analyze the hybrid models. With 30% added noise, the majority voting technique produced the best MCC score of 0.942. This demonstrates that the majority voting method performs well even when there is noise.
Another group of authors [4] developed a machine learning-based method for identifying credit card fraud that uses the genetic algorithm (GA) for feature selection. After selecting the optimal attributes, the suggested detection engine employs the machine learning classifiers decision tree (DT), random forest (RF), logistic regression (LR), artificial neural network (ANN), and Naive Bayes (NB). A dataset of transactions made by European cardholders is used to assess the performance of the proposed credit card fraud detection engine. The outcomes showed that their suggested method outperformed existing ones. The GA-DT attained a 100% accuracy and an AUC of 1, followed by the GA-ANN, which had a 100% accuracy and an AUC of 0.94. Their findings demonstrated that GA-RF attained the highest possible degree of general accuracy.
To the best of the authors' knowledge, there is no work on detecting credit card fraud using a hybrid CNN-SVM algorithm. Therefore, in this study, we develop a credit card fraud detection model based on a hybrid CNN-SVM.

Methods
The CNN and the SVM classifiers were integrated into our suggested model design. In Section 3.1, we discuss the nature of the dataset; in Section 3.2, we give a quick overview of the 1D-CNN theory; and in Section 3.3, we discuss the SVM structure. The hybrid CNN-SVM trainable feature extractor model is then introduced in Section 3.4, and finally the model performance evaluation metrics are discussed in Section 3.5.

Dataset.
We employ a secondary dataset made up of transaction information from European credit card users (https://www.kaggle.com) for the purposes of this study. This dataset has a total of 284,809 transactions. The dataset has 30 features: the columns V1-V28, time, and amount. Personal information and other elements that may include sensitive data are covered by the obfuscated columns V1-V28. The target variable, which consists of two classes, is shown in the final column. A fraudulent transaction represents the first class and has a value of 1, while a nonfraudulent transaction has a value of 0. Only 0.2432 percent of the transactions in the dataset were fraudulent; hence, models trained on the raw data tend to predict legitimate transactions reliably while failing to predict fraudulent ones. Thus, when training the models, it is crucial to maintain a balance between the classes in the dataset. To close this gap, a sampling technique is required.
Resampling the data is one of the approaches most frequently used to rectify an unbalanced dataset. Undersampling and oversampling are the two main ways of doing this. Oversampling approaches are typically favored over undersampling ones because, when we undersample data, we frequently leave out cases that could contain crucial information. In this study, we use the synthetic minority oversampling technique (SMOTE).
SMOTE selects a point at random from the minority class and calculates its K-nearest neighbors [17,18]. The synthetic points are inserted between the selected point and its neighbors. The SMOTE oversampling process consists of the following steps [19,20]:
(i) For each pattern $x_0$ in the minority class, that is, in the fraudulent class of credit card transactions:
(1) Choose any $x$ of its K-nearest neighbors (also belonging to the fraudulent class).
(2) Create a new pattern $x'$ on the line segment joining the pattern and the selected neighbor:
$$x' = x_0 + \lambda (x - x_0),$$
where $\lambda$ is a uniform random variable in the range [0, 1].
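The interpolation step of SMOTE can be illustrated with a minimal sketch in plain Python. The function names (`smote_point`, `k_nearest`, `smote`), the toy points, and the choice of one synthetic point per minority pattern are assumptions made for illustration only; they are not the implementation used in this study.

```python
import random

def smote_point(x0, neighbor, lam):
    """Return x' = x0 + lam * (neighbor - x0) for two feature vectors."""
    return [a + lam * (b - a) for a, b in zip(x0, neighbor)]

def k_nearest(x0, minority, k):
    """Return the k minority patterns closest to x0 (squared Euclidean distance)."""
    others = [p for p in minority if p is not x0]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x0, p)))
    return others[:k]

def smote(minority, k=2, seed=0):
    """Create one synthetic pattern per minority pattern (illustrative choice)."""
    rng = random.Random(seed)
    synthetic = []
    for x0 in minority:
        neighbor = rng.choice(k_nearest(x0, minority, k))  # step (1)
        lam = rng.random()                                 # uniform in [0, 1)
        synthetic.append(smote_point(x0, neighbor, lam))   # step (2)
    return synthetic

# Toy "fraudulent" patterns with two features each (made-up values).
fraud = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
new_points = smote(fraud, k=2)
```

Because every synthetic pattern lies on a segment between two existing minority patterns, the new points stay inside the region already occupied by the fraudulent class.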

Convolutional Neural Network.
A convolutional neural network (CNN) is a type of feedforward neural network that makes use of pooling, convolution, and ReLU layers [21,22]. A CNN is a multilayer artificial neural network with a deep supervised learning architecture that is primarily built from four kinds of layers. The components of a 1D-CNN are convolution layers, pooling layers, dropout layers, and activation functions for processing one-dimensional data.

Convolutional Layer.
A convolutional layer is an essential part of a convolutional neural network [21]. Its role is to apply a filter to an input. A feature map, which shows the specific properties connected to the data points, is produced by applying the filter repeatedly across the input. Convolution is a linear operation in which a set of weights is multiplied with the inputs. In the one-dimensional case, the array of weights, referred to as the kernel, is multiplied element-wise by a window of the input and the products are summed. Each position of the window produces one value, and the collection of these values is the feature map.
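As a minimal sketch of this operation, the following plain-Python function slides a kernel over a one-dimensional input with stride 1 and no padding. The stride, the absence of padding, and the toy values are assumptions for illustration; a trained layer would also add a bias and learn many kernels.

```python
def conv1d(inputs, kernel):
    """Valid 1D convolution as used in CNNs (cross-correlation, stride 1)."""
    k = len(kernel)
    return [
        sum(inputs[i + j] * kernel[j] for j in range(k))  # one window -> one value
        for i in range(len(inputs) - k + 1)
    ]

signal = [1.0, 2.0, 3.0, 4.0]   # toy input features
kernel = [1.0, -1.0]            # kernel of size two
feature_map = conv1d(signal, kernel)  # [-1.0, -1.0, -1.0]
```

An input of length n and a kernel of size k give a feature map of length n - k + 1 under these assumptions.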

Pooling Layer.
Once a feature has been identified, its exact placement is less important [23]. Consequently, the pooling or subsampling layer comes after the convolution layer [24]. Pooling techniques can be thought of as downsampling operations that reduce the number of parameters while retaining the most important features in order to speed up the subsequent computation. The pooling step also helps mitigate overfitting. Even though CNNs can use a variety of pooling approaches, max-pooling is the most widely used one. The pooling strategy has the dual benefits of significantly reducing the number of trainable parameters and introducing translation invariance [25]. To carry out a pooling operation, a window is chosen and the input items included in that window are passed through a pooling function [26].
The input is split into 1D pooling regions, as shown in Figure 1, and a 1D max-pooling layer downsamples by finding the maximum of each region.
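A short sketch of 1D max-pooling in plain Python follows. It assumes non-overlapping windows (stride equal to the pool size) and drops a trailing partial window; both are illustrative choices rather than details reported for the model.

```python
def max_pool1d(inputs, pool_size=2):
    """Non-overlapping 1D max pooling; a trailing partial window is dropped."""
    n_windows = len(inputs) // pool_size
    return [
        max(inputs[i * pool_size:(i + 1) * pool_size])
        for i in range(n_windows)
    ]

pooled = max_pool1d([1, 3, 2, 5, 4, 0, 7], pool_size=2)  # [3, 5, 4]
```

The output is roughly half as long as the input for a pool size of two, which is where the reduction in downstream computation comes from.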

Dropout.
Overfitting, which is loosely defined as the problem of memorizing the inputs instead of learning their general characteristics, is reduced by using dropout. Dropout makes use of randomness during the training process by temporarily disabling a random subset of neurons. As a result, the weights are tuned for the problem as a whole rather than for noise in the data. If not used correctly, it could lead to slow training or a failure to recognize trends.
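The randomness introduced by dropout can be sketched as follows. This is the common "inverted dropout" formulation, in which the surviving activations are rescaled by 1/(1 - rate); whether the framework used in this study applies exactly this variant is an assumption, and the function name and values are illustrative.

```python
import random

def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

# With rate = 0.5, each activation is either dropped (0.0) or doubled.
dropped = dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=random.Random(0))
```

Dropout is applied only during training; at prediction time all units are kept.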

Activation Function.
ReLU is a piecewise-linear activation function, f(x) = max(0, x), that returns the input unchanged when it is positive and returns zero when it is negative [27]. The ReLU activation function mitigates the vanishing gradient problem, improves the model performance, and speeds up learning from the training data.
Support Vector Machine.
Support vector machine (SVM) is one of the relatively recent and promising techniques for learning separating functions in pattern recognition (classification) tasks or for performing function estimation in regression problems [28]. Instead of offering a regression model and an algorithm, support vector machines offer a classification learning model and an algorithm [29]. The goal of employing SVM is to identify a classification criterion (i.e., a decision function) that, at the testing stage, can accurately classify unknown data with good generalization [30]. A training set is said to be linearly separable if a linear discriminant function exists whose sign corresponds to the class of each training example. If a training set can be linearly separated, there are typically an infinite number of separating hyperplanes; the SVM chooses the separating hyperplane that maximizes the margin, that is, the one that leaves the most distance between it and the nearest example [31].
We consider a set of data points made up of n vectors $x_i$, each of which is linked to a value $y_i$ that indicates whether the element belongs to the fraud class (+1) or the not-fraud class (−1).
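A linear hyperplane for a set of training data $x_i$, for $i = 1, 2, \ldots, n$, is defined as
$$w^{T} x + b = 0, \quad (1)$$
where $w$ is a weight vector with the same dimension as $x$ and $b$ is a bias term. The optimal hyperplane is required to satisfy the following constrained minimization [32]:
$$\min_{w, b} \ \frac{1}{2} \|w\|^{2} \quad \text{subject to} \quad y_i \left( w^{T} x_i + b \right) \ge 1, \quad i = 1, \ldots, n. \quad (2)$$
In these cases, the constraints can be enforced using Lagrange multipliers $\alpha_i \ge 0$, as follows:
$$L(w, b, \alpha) = \frac{1}{2} \|w\|^{2} - \sum_{i=1}^{n} \alpha_i \left[ y_i \left( w^{T} x_i + b \right) - 1 \right]. \quad (3)$$
To find the minimum of (3) over $w$ and $b$ (while fixing all $\alpha_i$), we set the gradient to zero as follows:
$$\frac{\partial L}{\partial w} = w - \sum_{i=1}^{n} \alpha_i y_i x_i = 0, \qquad \frac{\partial L}{\partial b} = - \sum_{i=1}^{n} \alpha_i y_i = 0. \quad (4)$$
The solution, which yields the maximum-margin hyperplane used to categorize the credit card transactions into the nonfraudulent and fraudulent classes, is as follows:
$$f(x) = \operatorname{sign} \left( \sum_{i=1}^{n} \alpha_i y_i x_i^{T} x + b \right). \quad (5)$$
When the training set is not linearly separable, the optimal hyperplane defined above is ineffective for classifying credit card fraud. To counter this problem, a soft margin is introduced [33]. Slack variables are introduced to permit some constraints to be violated. In other words, some training points are allowed to fall inside the margin. We want their margin penetration to be as slight as is practical, and the margin should contain as few points as possible:
$$\min_{w, b, \xi} \ \frac{1}{2} \|w\|^{2} + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i \left( w^{T} x_i + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad (6)$$
where $\xi_i$ is a slack variable and $C$ is the penalty parameter of the error term. The solution of (6) is obtained from the dual problem
$$\max_{\alpha} \ \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^{T} x_j \quad \text{subject to} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0. \quad (7)$$
Higher values of $C$ produce lower bias and higher variance, while lower values of $C$ have the opposite effect: higher bias and lower variance. It is necessary to determine the ideal $C$ value for the trade-off between bias and variance.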
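To make the role of the slack variables and of C concrete, the following plain-Python sketch evaluates the soft-margin objective, one half of the squared norm of w plus C times the sum of the slacks, for a fixed hyperplane. The hyperplane, the three toy points, and the function names are assumptions chosen for illustration; no optimization is performed here.

```python
def decision(w, b, x):
    """Signed score w^T x + b of a point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def soft_margin_objective(w, b, X, y, C):
    """Return (objective, slacks) for the soft-margin SVM primal problem."""
    # Smallest slack that satisfies y_i (w^T x_i + b) >= 1 - xi_i with xi_i >= 0.
    slacks = [max(0.0, 1.0 - yi * decision(w, b, xi)) for xi, yi in zip(X, y)]
    objective = 0.5 * sum(wi * wi for wi in w) + C * sum(slacks)
    return objective, slacks

X = [[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0]]  # toy points
y = [1, 1, -1]                             # +1 = fraud, -1 = not fraud
w, b = [1.0, 0.0], 0.0                     # a fixed, hand-picked hyperplane

obj_large_c, slacks = soft_margin_objective(w, b, X, y, C=100.0)  # 50.5
obj_small_c, _ = soft_margin_objective(w, b, X, y, C=1.0)         # 1.0
```

Only the second point lies inside the margin (slack 0.5). With C = 100 this single violation dominates the objective, whereas with C = 1 it costs no more than the margin term, which is why larger C fits the training data more tightly.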

Hybrid CNN-SVM.
Convolutional neural networks can serve as feature extractors for both support vector machines and fully connected networks because of their capacity to extract features automatically. The main characteristics of both classifiers are combined in the suggested hybrid model. Support vector machines function as classifiers [34], while the hierarchical structure of CNN, a successful deep-learning model, allows for the extraction and training of high-quality features at each layer [34,35]. SVMs are capable of outperforming convolutional neural networks in terms of classification performance [36]. Because the CNN already supplies the features, no additional feature extraction or selection stage is required before the support vector machine-based classification [37].
Our hybrid CNN-SVM model's architecture was created by substituting an SVM classifier for the last output layer of the CNN model [38,39]. The features obtained from the CNN layers are sent to the SVM classifier after being flattened and passed through the first fully connected layer. In other words, the support vector machine is used in place of the final fully connected softmax layer.
The proposed architecture has two classifiers, as shown in Figure 2. The initial classifier, trained end-to-end, consists of a fully connected layer with softmax. After the final fully connected and softmax layer has been removed, a binary SVM classifier is added on top. The convolutional layers extract features from the attributes of the transaction information, and the pooling layers reduce the size of the feature maps. As a result, the pooling layer lowers the amount of computation done within the network and the number of parameters that need to be learned from the credit card fraud data. The feature maps, which were created by repeatedly applying a number of convolutional and pooling layers, are flattened into a one-dimensional array and used as inputs for the support vector machine.
The architecture consists of three convolutional layers (Conv1D) with 256, 128, and 64 filters, respectively, a kernel size of two, a ReLU activation function, a dropout layer, pooling layers, and three dense layers. In order to prevent the model from overfitting, we implemented a dropout with a 0.5 rate, which disables 50% of the neurons during training, after the three successive convolution layers. The max-pooling layer lowers the computational cost of the model by reducing the number of parameters that must be learned. A flattening layer is then applied to the results, reducing the multidimensional feature maps to a single dimension. Next, three dense layers with ReLU and softmax activation functions were used to predict the target variable.
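The size of the flattened feature vector that reaches the dense layers (and, in the hybrid model, the SVM) can be traced with a short calculation. The sketch below assumes an input of length 30 (one value per feature), stride 1 and no padding in the convolutions, and a single max-pooling layer of size 2; none of these settings is stated in the text, so the resulting numbers are illustrative only.

```python
def conv_out_len(length, kernel_size):
    """Output length of a stride-1 convolution without padding."""
    return length - kernel_size + 1

def pool_out_len(length, pool_size):
    """Output length of non-overlapping pooling (partial window dropped)."""
    return length // pool_size

length = 30    # assumed: one time step per input feature
channels = 1
for filters in (256, 128, 64):           # the three Conv1D layers, kernel size two
    length = conv_out_len(length, 2)     # 30 -> 29 -> 28 -> 27
    channels = filters

length = pool_out_len(length, 2)         # assumed pool size 2: 27 -> 13
flattened = length * channels            # 13 * 64 = 832 values per transaction
```

Under these assumptions each transaction would be represented by 832 extracted features after flattening; different padding or pooling settings change this number.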

Model Performance Evaluation Techniques.
Depending on the aim of the experiment, a variety of statistical metrics can be used to evaluate binary classifications [40]. The most important measures for binary classification problems are accuracy and F1-score. The accuracy of a model measures the number of correctly predicted data points out of all the data points [41][42][43][44]. Accuracy is effective when the target classes are well balanced, but it is not a suitable option for classes that are imbalanced.
Four types of outcomes are possible [45] when evaluating the performance of a credit card fraud detection and prediction model.
(i) True positives (TP): The model's positive prediction is correct. This is the case where a transaction is fraudulent and the model also predicts fraud.
(ii) True negatives (TN): The transaction is nonfraudulent and the model does not flag it; in other words, the model says nothing has happened and nothing has really occurred.
(iii) False positives (FP): The model claims that something has happened when it has not; that is, the model predicts fraud even though the transaction is nonfraudulent. Errors of this kind are classified as Type I errors.
(iv) False negatives (FN): The model fails to flag an event that has occurred, such as when a customer is the target of fraud but the model did not foresee it. This is a very serious error and is called a Type II error.
Precision is the proportion of accurately predicted positive outcomes out of all predicted positive outcomes. It can be expressed as the ratio of true positives (TP) to the total of true and false positives (TP + FP). Precision is helpful when false positives are more problematic than false negatives.
Recall, also called sensitivity, is the proportion of accurately predicted positive outcomes among all actual positive outcomes. It is the ratio of true positives (TP) to the sum of true positives and false negatives (TP + FN).
Recall and precision are combined into one performance indicator called the F1-score, which is the harmonic mean of precision and recall [46]. Therefore, both false positives and false negatives are considered when calculating this score. The F1-score is typically more informative than accuracy, especially when there is an uneven distribution of classes.
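These definitions can be written out directly from the four confusion-matrix counts. The sketch below is plain Python with made-up counts for an imbalanced test set; the function name and the numbers are illustrative and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts: 18 fraudulent and 982 nonfraudulent transactions.
m = classification_metrics(tp=8, tn=980, fp=2, fn=10)
```

With these counts the accuracy is 0.988 even though more than half of the fraudulent transactions are missed (recall of about 0.44, F1-score of about 0.57), which illustrates why accuracy alone is a poor guide on imbalanced data.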
The receiver operating characteristic (ROC) curve is a probability curve that plots the true positive rate (TPR) against the false positive rate (FPR) at different threshold values and shows how well the model separates the "nonfraudulent" from the "fraudulent" class. The area under this curve (AUC) measures a classifier's ability to distinguish between fraudulent and nonfraudulent classes. The ability of the classifier to distinguish between the target classes improves with increasing AUC values [47].
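The AUC also has a ranking interpretation: it equals the probability that a randomly chosen fraudulent transaction receives a higher score than a randomly chosen nonfraudulent one, with ties counted as one half. The following plain-Python sketch computes it that way; the function name and the four toy scores are assumptions for illustration, and the quadratic pairwise loop is meant for clarity rather than for large datasets.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy example: labels (1 = fraudulent) and classifier scores.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

Three of the four fraudulent/nonfraudulent pairs are ordered correctly here, giving an AUC of 0.75; a value of 0.5 corresponds to random ranking and 1.0 to perfect separation.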

Results and Discussion
We carry out two experiments. In the first experiment, a convolutional neural network served as a feature extractor while a fully connected network with softmax served as a classifier. The Adam optimizer, ReLU activation function, max-pooling, and dropout = 0.5 are used during 15 training epochs before the network produces the desired results. We utilize the synthetic minority oversampling technique (SMOTE) due to the highly unbalanced structure of the dataset. The second experiment used a support vector machine classifier. The output of the last CNN layers from the first experimental setup is used as the feature set in this experiment to train the second classifier, a support vector machine.
The support vector machine with C = 100, gamma = 0.9, and a radial basis function kernel, stacked on top of the flatten layer, achieved an accuracy of 91.08%. It would be tempting to conclude from this alone that our model is quite effective in detecting fraud. However, such a conclusion is not warranted from accuracy alone because the dataset is imbalanced. When evaluating predictive performance in the context of an unbalanced dataset, accuracy is a poor criterion to use. We must thus consider additional performance criteria that offer better guidance for model evaluation. Therefore, precision, recall, F1-score, and AUC are helpful measures to assess how well the fraud detection model is working.
Table 1 shows the prediction performance of our proposed method. The precision score, which measures how effectively fraudulent transactions are distinguished from nonfraudulent transactions, was 90.50% for the proposed method. Our hybrid model achieved a recall of 90.34%, which is the proportion of positive samples that were correctly identified as positive in comparison to all positive samples. As more positive samples are discovered, recall, a measure of how well the model can identify positive samples, increases. The F1-score, which ranges from 0 to 1 with 1 being the best, can be used to evaluate the overall performance of a model. More specifically, the F1-score may be considered as the model's capacity to balance its ability to capture positive cases (recall) and to be accurate with the cases it does capture (precision), and our model achieved 90.41%. As a result, both false positives and false negatives are taken into account in this score. The AUC value it attained, 91.05%, shows that the suggested approach is capable of telling valid credit card transactions apart from fraudulent ones.
The prediction capability of our proposed method based on various kernels is displayed in Table 2. The kernels are radial basis function (RBF), sigmoid, polynomial, and linear. Based on the findings of the experiments, the polynomial and sigmoid kernels perform poorly when it comes to predicting credit card fraud. The maximum accuracy that we achieved with the radial basis function and linear kernels is 91.08% and 90.62%, respectively, with C = 100 and gamma = 0.9. Therefore, we conclude that our proposed technique performs best in predicting credit card fraud when a radial basis function or linear kernel is used.
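For reference, the four kernels compared above can be written as short functions. The sketch below uses the parameterization found in common SVM libraries, with gamma = 0.9 as reported; the polynomial degree of 3 and the offset `coef0` of 0 are assumptions, since these settings are not stated in the text.

```python
import math

def dot(x, z):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, z))

def linear_kernel(x, z):
    return dot(x, z)

def polynomial_kernel(x, z, gamma=0.9, coef0=0.0, degree=3):
    # degree and coef0 are assumed defaults, not values from the study.
    return (gamma * dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.9):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(x, z, gamma=0.9, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)

a, b = [1.0, 2.0], [3.0, 4.0]
k_lin = linear_kernel(a, b)   # 11.0
k_rbf_same = rbf_kernel(a, a) # 1.0, identical points are maximally similar
```

The RBF kernel depends only on the distance between two feature vectors, so a larger gamma makes the similarity decay faster and the decision boundary more local.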
The hybrid CNN-SVM model's performance, as well as a comparison of the hybrid CNN-SVM model's performance to the CNN model, is shown in Figures 3(a

Mathematical Problems in Engineering
According to the experimental results, our proposed CNN-SVM algorithm outperforms the fully connected convolutional neural network in its ability to classify transactions as fraudulent or nonfraudulent, with precision, recall, F1-score, and area under the curve (AUC) of 90.50%, 90.34%, 90.41%, and 91.05%, respectively.
For better visualization of the performance of the proposed method, AUC-ROC curves (performance indicators for classification problems at different threshold settings) and a precision-recall curve (which summarizes a range of threshold values as a single score) are shown in Figures 4(a

Conclusions
Credit card fraud is a common occurrence in the modern world because the majority of us use credit cards as a form of payment regularly. This is the result of technological advancements and a growth in online transactions, which have led to frauds that have resulted in significant losses. To find fraud in credit card transactions, efficient and effective methods are needed.
In this study, we create a hybrid CNN-SVM model to identify fraudulent credit card transactions. We assessed the effectiveness of our proposed hybrid CNN-SVM model for detecting credit card fraud using real-world, openly accessible credit card transaction data.
The dataset from https://www.kaggle.com has 284,809 transactions in total. The dataset has V1, ..., V28, time, and amount among its 30 features. Personal data and other elements that can include sensitive data are covered in the obfuscated columns V1-V28. The target variable, which consists of fraudulent and nonfraudulent transactions, is quite imbalanced. As a result, models trained on the raw data tend to predict valid transactions correctly while failing to predict fraudulent ones. So, when training the models, it is crucial to maintain a balance between the classes in the dataset. Hence, we applied the synthetic minority oversampling technique (SMOTE) to overcome this challenge. SMOTE randomly chooses a point from the minority class and determines this point's K-nearest neighbors. The synthetic points are placed between the chosen point and its neighbors.
Our hybrid CNN-SVM model was designed by replacing the final output layer of the CNN model with an SVM classifier. The first classifier employs a fully connected layer with softmax and is trained end-to-end. The second classifier, a support vector machine, is stacked on top of the first classifier after removing the last fully connected and softmax layer. The convolutional layers extract features from the attributes of the transaction information, and the pooling layers reduce the size of the feature maps. As a result, the pooling layer lowers the amount of computation done within the network and the number of parameters that need to be learned from the credit card fraud data. The feature maps, which were created by repeatedly applying a number of convolutional and pooling layers, are flattened into a one-dimensional array and used as inputs for the support vector machine.
The experimental results show that our proposed method outperforms the fully connected convolutional neural network in terms of precision, recall, F1-score, and AUC in predicting whether a credit card transaction is fraudulent or not.
In general, we conclude that using a CNN hybridized with an SVM to detect credit card fraud is preferable to employing a fully connected convolutional neural network with softmax.
In our future work, we will develop a spatio-temporal model with hybrid machine learning algorithms to detect credit card fraud and compare various hybrid machine learning techniques.

Figure 2: Architecture of the model.

Figure 3: ROC and precision-recall curve. (a) Performance metrics are evaluated by comparing a hybrid CNN-SVM with four different kernels, namely, linear, sigmoid, polynomial, and radial basis function. (b) Comparison of hybrid CNN-SVM and CNN using various performance metrics.

Figure 4: ROC and precision-recall curve. (a) ROC curve: AUC-ROC curves are performance indicators for classification problems with various threshold settings. (b) Precision-recall curve: the curve with a variety of threshold values is summarized as a single score by the precision-recall curve.
removed by default by the model, and 0.005 learning rate was chosen by the model. There were 43 trees in the distributed random forest, with a maximum tree depth of 20. The lack of overfitting is achieved by the minimal tree depth, which reduces model complexity. A minimum of 5 observations must be made for each leaf, according to the min rows option, which has the value 5 specified. The boosting machine contained 116 trees at the maximum depth of 15 trees. Overfitting can be prevented, which will reduce model complexity. Each tree will take up 80% of the columns because the minimum number of rows to sample for each tree was set at 100 and the column sampling rate was set at 0.8. The findings of this experiment indicate that SMOTE-based sampling techniques will perform well. The best recall score, which was 0.81, was achieved by the SMOTE sampling method utilizing the DRF classifier.