Heston-GA Hybrid Option Pricing Model Based on ResNet50

(1) Background. is study aims to improve the accuracy of the pricing model. (2) Methods. Heston model is combined with ResNet50 convolutional neural network model. Based on the optimization of Heston model parameters by genetic algorithm (GA), ResNet50 model is used to correct the deviation between market option price and Heston price, so a new hybrid option pricing model is established based on the empirical research on the European call options of Huatai-PB CSI 300ETF (code 510300), Harvest CSI 300ETF (code 159919), and SSE 50ETF (code 510050). (3) Results. e pricing result of the hybrid model is better than other single models and hybridmodels.emodel is applicable to the pricing of options with short and long remaining terms. (4) Conclusions. It is shown that the combination of Heston model and ResNet50 model with optimized parameters can ensure the interpretability of the model and enhance the nonlinear tting ability of the model, which conrms the eectiveness of the hybrid model and provides a reference for investors and institutions to make scientic decisions.


Introduction
In 1973, Black, Scholes, and Merton made a major breakthrough in European option pricing and developed the classic Black-Scholes-Merton model (BSM model) [1,2]. To relax the assumption that the volatility in BSM model is constant, Stein et al. [3] proposed the Stein-Stein stochastic volatility model by introducing a stochastic process for the volatility of the underlying asset price, so that the asset price and the volatility follow their respective stochastic processes. In 1993, Heston [4] applied the mean-reversion square root process to address the problem of negative volatility values in the Stein-Stein model. Heston model has been widely used and studied because it portrays a reasonable implied-volatility shape and admits a closed-form solution [5][6][7]. Up to now, the optimization of Heston model from the mathematical perspective mainly involves five directions: considering the randomness of the constant terms [8][9][10][11][12], improving the geometric Brownian motion [13,14], adding a jump diffusion process [15,16], changing the power of the variance [17,18], and considering the roughness of the volatility [19,20]. These mathematical enhancements allow Heston models to fit real market dynamics more rigorously, but two problems remain: the lack of robustness due to insufficient consideration of the factors influencing the underlying asset prices and the lack of practical usability due to the complexity of the model structure. The above BSM-based models are classified as parametric models, while the neural network (NN) option pricing models are called nonparametric models. NN is a nonlinear model driven by market data, whose structure and parameters are determined from the acquired financial sample data without any assumption. This property makes NN models more suitable for option pricing when the underlying asset price dynamics are unknown and the pricing equation has no analytical solution.
The early application of NNs in option pricing can be traced back to 1993, when the NN model used by Malliaris et al. [21] outperformed BSM model. Mitra [22] improved NN's pricing performance from the perspective of structure optimization: following BSM, the input layer of the NN model was designed as the 5 variables in the BSM solution, the hidden layer was the two normal distribution values of BSM, and the output layer was the option price. Yacin et al. [23] considered volatility as a random variable and input it into the NN model. Results showed that the NN model was more accurate than Heston model. In addition, many other researchers have extensively studied the application of NNs in options pricing [24][25][26][27][28]. However, due to the "black box" problem, constructing an NN option pricing model still lacks rigorous mathematical proof, and the constructed model can hardly be explained by traditional financial theories.
Maciej et al. [29] found that NN models cannot adapt to various market conditions, i.e., they are not robust, and proposed incorporating the NN model with BSM model or Heston model. Combining the financial model and the NN model to build a hybrid model can effectively overcome the drawbacks of both, improving the nonlinear fitting ability while enhancing the interpretability. Arin et al. [30] used a deep multilayer perceptron to fit BSM option price deviations where the ratio of strike price to stock price exceeds a certain threshold and thus built a hybrid deep neural network model. Hideharu [31] combined the advantages of AE (asymptotic expansion) and NN, used NN to learn the residual between the option price and its asymptotic approximation to improve stability and approximation accuracy, and used the hybrid model for option pricing in both local and stochastic volatility models. Zhang et al. [32] replaced BSM model in the traditional hybrid NN models with Heston stochastic volatility model, and Yang et al. [33] modified the pricing bias of Heston model using a nested long short-term memory neural network (NLSTM). Their results are all better than those of single models. Inspired by the above hybrid BSM-NNs models, we optimize Heston model from outside rather than inside, i.e., using an NN model to correct the nonlinear deviation of Heston prices from market prices.
Both classical option pricing models and NN models have their own flaws: (1) The classical parametric option pricing model must follow strict assumptions, which results in systematic deviation when the assumed conditions differ greatly from the real market. (2) The NN models have problems such as insufficient fitness for discontinuous trading. So, we explore a hybrid model which can avoid their weaknesses and make full use of their strengths. The empirical results also show that hybrid NNs models adapt well to ETF options pricing. In fact, the option price, which is affected by many factors, is not a univariate time series. At present, the term of ETF option contracts in China is up to six months, and several option contracts with different maturities and strike prices are traded in the market at the same time.
Therefore, it is necessary to comprehensively consider multiple option contracts for pricing. For this reason, we choose a convolutional neural network (CNN) model, i.e., construct a hybrid CNN model based on Heston model for option pricing. In our research, ResNet50 CNN model is used to correct the pricing bias between the Heston price and the market price. Our hybrid model gives full play to the precise pricing performance of the classical option model and the nonlinear fitting performance of the deep NNs model. Furthermore, our hybrid model is used to carry out empirical research on Huatai-PB CSI 300ETF option, Harvest CSI 300ETF option, and SSE 50ETF option. Experimental results illustrate the effectiveness and high performance of our hybrid model.

Theoretical Model of Option Pricing
The hybrid CNN model in this paper includes both Heston option pricing model and ResNet50 CNN model. Both are described separately, followed by an explanation of the steps in genetic algorithm (GA) for estimating the parameters of Heston model, and finally there is an explanation of the overall model construction process.

Heston Model.
Traditional BSM model assumes that the volatility of the underlying asset price is constant, but, in real financial markets, volatility changes over time and there is a "volatility smile" phenomenon. For this reason, Heston used the square root mean-reversion process [4] to describe the real-time variance, with the underlying asset price and its variance moving according to the following stochastic differential equations:

$$dS_t = \mu S_t\,dt + \sqrt{V_t}\,S_t\,dW_t^1, \qquad dV_t = \kappa(\theta - V_t)\,dt + \sigma\sqrt{V_t}\,dW_t^2, \tag{1}$$

where $S_t$ and $V_t$ represent, respectively, the underlying asset price and the variance of the asset price at time $t$, and the parameter $\mu$ represents the expected rate of return (drift) of the asset price. $\{V_t\}_{t \ge 0}$ follows a square root process. The parameter $\theta$ corresponds to the long-run average of $V_t$ and $\kappa$ is the mean-reversion speed of the variance. The parameter $\rho$ is the correlation between the underlying asset and the volatility, i.e., $dW_t^1\,dW_t^2 = \rho\,dt$, while $\sigma$ is the volatility of the variance. Also, $W_t^1$ and $W_t^2$ are two standard Brownian motions.
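To make the dynamics in (1) concrete, the following is a minimal simulation sketch rather than part of the pricing method itself. It assumes a full-truncation Euler discretization (the variance is floored at zero inside the drift and diffusion terms), which is one common choice and is not taken from the paper; the function name `simulate_heston_path` and all numerical values are hypothetical.

```python
import math
import random


def simulate_heston_path(s0, v0, mu, kappa, theta, sigma, rho, T, n_steps, rng):
    """Simulate one (S_t, V_t) path of equation (1).

    Assumption: full-truncation Euler scheme, i.e. max(V, 0) is used wherever
    the variance enters the drift or the diffusion.
    """
    dt = T / n_steps
    s, v = s0, v0
    path = [(s, v)]
    for _ in range(n_steps):
        # Two standard normal draws with correlation rho
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        v_pos = max(v, 0.0)
        # Log-Euler step for the price keeps S strictly positive
        s = s * math.exp((mu - 0.5 * v_pos) * dt + math.sqrt(v_pos * dt) * z1)
        # Euler step for the square root variance process
        v = v + kappa * (theta - v_pos) * dt + sigma * math.sqrt(v_pos * dt) * z2
        path.append((s, v))
    return path


if __name__ == "__main__":
    demo = simulate_heston_path(4.0, 0.04, 0.05, 2.0, 0.04, 0.3, -0.7,
                                0.5, 125, random.Random(0))
    print(demo[-1])
```

Setting `sigma = 0` and `v0 = theta` keeps the variance constant, which recovers the constant-volatility setting of BSM as a special case.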

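Theorem 1.
Assume that the underlying asset price $S_t$ and its variance $V_t$ obey (1) at time $t$ ($0 \le t \le T$); then the European call option price $C(S_t, V_t, t)$ satisfies the following partial differential equation (PDE), denoting $S_t$ and $V_t$ by $S$ and $V$ for simplified notation here:

$$\frac{1}{2}VS^2\frac{\partial^2 C}{\partial S^2} + \rho\sigma VS\frac{\partial^2 C}{\partial S\,\partial V} + \frac{1}{2}\sigma^2 V\frac{\partial^2 C}{\partial V^2} + rS\frac{\partial C}{\partial S} + \left[\kappa(\theta - V) - \lambda(S, V, t)\right]\frac{\partial C}{\partial V} - rC + \frac{\partial C}{\partial t} = 0, \tag{2}$$

where the parameter $r$ is the risk-free interest rate and $\lambda(S_t, V_t, t)$ is the volatility risk premium. The PDE (2) is subject to the following terminal and boundary conditions, as in [4]:

$$\begin{aligned} C(S, V, T) &= \max(0, S - K), \\ C(0, V, t) &= 0, \\ \frac{\partial C}{\partial S}(\infty, V, t) &= 1, \\ rS\frac{\partial C}{\partial S}(S, 0, t) + \kappa\theta\frac{\partial C}{\partial V}(S, 0, t) - rC(S, 0, t) + \frac{\partial C}{\partial t}(S, 0, t) &= 0, \\ C(S, \infty, t) &= S. \end{aligned} \tag{3}$$

Theorem 2.
Based on the PDE (2) and the conditions (3), the price $C(S_t, V_t, t)$ of a European call option on a non-dividend-paying asset with expiry date $T$ and strike price $K$ has the following closed-form solution:

$$C(S_t, V_t, t) = S_t P_1 - K e^{-r(T - t)} P_2, \tag{4}$$

where $P_1$ and $P_2$ are probabilities. Their mathematical expression is

$$P_j = \frac{1}{2} + \frac{1}{\pi}\int_0^{\infty} \mathrm{Re}\left[\frac{e^{-i\phi \ln K} f_j(x, V_t, T; \phi)}{i\phi}\right] d\phi, \tag{5}$$

where $j = 1, 2$, and $f_1$, $f_2$ represent the characteristic functions of $P_1$, $P_2$. $\mathrm{Re}(y)$ represents the real part of $y$, $i$ is the imaginary unit, and $\phi$ is the integration variable. With $\tau = T - t$, the expressions for each part of (5) are

$$\begin{aligned} f_j(x, V_t, T; \phi) &= \exp\left(C_j(\tau; \phi) + D_j(\tau; \phi)V_t + i\phi x\right), \\ C_j(\tau; \phi) &= r\phi i\tau + \frac{a}{\sigma^2}\left[(b_j - \rho\sigma\phi i + d_j)\tau - 2\ln\left(\frac{1 - g_j e^{d_j\tau}}{1 - g_j}\right)\right], \\ D_j(\tau; \phi) &= \frac{b_j - \rho\sigma\phi i + d_j}{\sigma^2}\cdot\frac{1 - e^{d_j\tau}}{1 - g_j e^{d_j\tau}}, \\ g_j &= \frac{b_j - \rho\sigma\phi i + d_j}{b_j - \rho\sigma\phi i - d_j}, \\ d_j &= \sqrt{(\rho\sigma\phi i - b_j)^2 - \sigma^2(2u_j\phi i - \phi^2)}, \end{aligned}$$

with $x = \ln S_t$, where $j = 1, 2$, $a = \kappa\theta$, $u_1 = 1/2$, $u_2 = -1/2$, $b_1 = \kappa + k - \rho\sigma$, and $b_2 = \kappa + k$. Here $k$ is the coefficient of the volatility risk premium introduced next.

In addition, since volatility is not a tradable commodity in the market, the volatility risk premium is an unobservable measure. Heston assumes that the premium is proportional to the variance, $\lambda(S_t, V_t, t) = kV_t$. Under the equivalent martingale measure the premium is absorbed into the drift of the variance process, so the volatility risk price is $\lambda = 0$ in a risk-neutral world, where pricing is independent of investors' risk attitudes. Then, the above parameters can be expressed as

$$\kappa^* = \kappa + k, \qquad \theta^* = \frac{\kappa\theta}{\kappa + k}, \qquad a = \kappa^*\theta^*, \qquad b_1 = \kappa^* - \rho\sigma, \qquad b_2 = \kappa^*.$$

The parameters to be estimated of Heston model also change from $\{\kappa, \theta, \sigma, V_0, \rho, \lambda\}$ to $\{\kappa^*, \theta^*, \sigma, V_0, \rho\}$.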
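The closed-form solution of Theorem 2 can be evaluated numerically. The sketch below is illustrative only and rests on several assumptions that are not from the paper: the integral in (5) is truncated at a hypothetical upper limit `phi_max` and approximated with a midpoint rule, and the characteristic function is written in the algebraically equivalent form that uses $e^{-d\tau}$ instead of $e^{d\tau}$, a rearrangement often preferred because it avoids overflow for large $\phi$. The function name `heston_call` and its default arguments are likewise hypothetical; `kappa` and `theta` are taken to be the risk-neutral parameters $\kappa^*$ and $\theta^*$.

```python
import cmath
import math


def heston_call(s, v, k, tau, r, kappa, theta, sigma, rho,
                phi_max=100.0, n=4000):
    """European call price from (4) and (5) under the risk-neutral measure.

    Assumptions: midpoint-rule integration on (0, phi_max]; numerically
    stable rearrangement of the characteristic function (e^{-d*tau} form).
    """
    x = math.log(s)
    a = kappa * theta
    log_k = math.log(k)
    dphi = phi_max / n

    def prob(u, b):
        total = 0.0
        for m in range(n):
            phi = (m + 0.5) * dphi          # midpoint avoids phi = 0
            iphi = 1j * phi
            beta = b - rho * sigma * iphi
            d = cmath.sqrt(beta * beta - sigma ** 2 * (2.0 * u * iphi - phi * phi))
            g = (beta - d) / (beta + d)     # reciprocal of g_j in the text
            e = cmath.exp(-d * tau)
            c_term = r * iphi * tau + a / sigma ** 2 * (
                (beta - d) * tau - 2.0 * cmath.log((1.0 - g * e) / (1.0 - g)))
            d_term = (beta - d) / sigma ** 2 * (1.0 - e) / (1.0 - g * e)
            f = cmath.exp(c_term + d_term * v + iphi * x)
            total += (cmath.exp(-iphi * log_k) * f / iphi).real
        return 0.5 + total * dphi / math.pi

    p1 = prob(0.5, kappa - rho * sigma)     # u_1 = 1/2,  b_1 = kappa* - rho*sigma
    p2 = prob(-0.5, kappa)                  # u_2 = -1/2, b_2 = kappa*
    return s * p1 - k * math.exp(-r * tau) * p2


if __name__ == "__main__":
    print(heston_call(100.0, 0.04, 100.0, 1.0, 0.03, 2.0, 0.04, 0.3, -0.7))
```

As a sanity check, when `sigma` is close to zero and `v` equals `theta`, the price should approach the BSM price with volatility `sqrt(theta)`.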

Convolutional Neural Network Model: ResNet50.
Convolutional neural network is a deep neural network model of multilayer supervised learning, with the convolutional layer and the sampling layer being the core modules that implement feature extraction. The convolutional layer performs local connectivity and weight sharing between neurons for feature extraction, and the sampling layer extracts the most representative features to achieve dimensionality reduction. Since Fukushima [34] first proposed convolutional neural network (CNN) in 1980, CNN has been widely used in machine learning tasks such as image recognition, face verification, speech recognition, and text classification. Classical models include LeNet, AlexNet, VGG, ResNet, and DenseNet. CNN, as the most representative network model in deep learning, was initially applied to image recognition, and option data has characteristics similar to image data, such as high dimensionality and local clustering, that is, multiple influencing factors, self-similarity, and memory of its own price movement process. Additionally, deepening the neural network can effectively improve the accuracy of the model, but the vanishing gradient and network degradation problems also arise. Taking all factors into consideration, ResNet50 model is chosen in this paper to correct the deviation caused by Heston model.
ResNet model is a residual network with modular thinking [35]. The key idea is the shortcut connection; that is, the input of the residual module is directly added to the output of the module through identity mapping. The basic structure of the residual module is shown in Figure 1, and its mathematical expression is

$$y = F(x) + x, \qquad F(x) = W_2\,\sigma(W_1 x),$$

where $x$ is the input of the residual module, $W_1$ and $W_2$ are the weight matrices of the two convolution layers, $\sigma$ is the ReLU nonlinear activation function, and $y$ is the output of the residual module. ResNet50 consists of 16 residual modules, each of which has three convolutional layers. Since the dimensions of input and output are inconsistent in some ResNet50 residual modules, it is necessary to change the dimensions of the input before adding it to the output, usually by adding a convolution process. To distinguish the two cases, the module whose input is mapped through the identity mapping is called Identity Block, and the module whose input is mapped through the convolution layer is called Conv Block. The structure diagrams of Identity Block and Conv Block of ResNet50 are shown in Figure 2, where batch norm represents the batch normalization layer and 1 * 1 and 3 * 3 represent the size of the convolution kernel.
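The shortcut connection can be illustrated with a toy example. The sketch below is a simplification under stated assumptions: plain matrix-vector products stand in for the convolution layers, batch normalization is omitted, and the helper names `relu`, `matvec`, and `residual_block` are hypothetical. It only shows how the input is added back to the transformed signal.

```python
def relu(vec):
    """Element-wise ReLU activation."""
    return [max(0.0, t) for t in vec]


def matvec(w, x):
    """Matrix-vector product; a stand-in for a convolution layer."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def residual_block(x, w1, w2):
    """Compute y = W2 * relu(W1 * x) + x (identity shortcut)."""
    fx = matvec(w2, relu(matvec(w1, x)))
    return [fi + xi for fi, xi in zip(fx, x)]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(residual_block([1.0, -2.0], identity, identity))
```

With all-zero weights the block reduces to the identity mapping, which is the property that lets very deep stacks of such modules avoid degradation: a module that has nothing useful to learn can simply pass its input through.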
The original ResNet50 model is made up of 49 convolutional layers and 1 fully connected layer (excluding pooling layers, batch normalization layers, and activation function layers). When He et al. [35] proposed this model, colour images with a size of 224 * 224 pixels were taken as input data. After feature extraction by one convolutional layer and one pooling layer, the data pass through the 16 residual modules, are flattened, and are finally mapped to a 1000-dimensional vector by a fully connected layer. Therefore, when ResNet50 model is applied to option pricing, it is necessary to construct the input data in the form of images, that is, in the two-dimensional ($k = 1$) or three-dimensional ($k \ge 2$) form m * n * k. These data can be composed of several options with certain characteristics in common (such as the same expiration month). Specific data include factors affecting the option price, such as strike price, remaining term, and underlying asset price. The ZeroPadding layer, pooling layer, and activation function are also adapted to the actual input data size and output data type.

Genetic Algorithm.
Option pricing with Heston model first requires estimation of the model parameters, of which five remain to be estimated after the equivalent martingale measure transformation. In this paper, genetic algorithm is used to estimate the parameters of Heston model [36]. Genetic algorithm is a random global search and optimization method based on the mechanisms of biological inheritance and evolution. It was created by Professor Holland at the University of Michigan, inspired by biosimulation techniques, and overcomes the tendency of traditional algorithms to fall into local extremums. Genetic algorithm obeys the law of survival of the fittest in nature and simulates the phenomena of inheritance, crossover, and mutation that occur during biological evolution. Starting from a randomly generated initial population, individuals are evaluated by a fitness function, and selection, crossover, and mutation produce, generation after generation, individuals that are better adapted to their environment, until the optimal solution to the problem is obtained. The basic flow of genetic algorithm is shown in Figure 3. The specific steps of the algorithm to estimate Heston model parameters are as follows.

Step 1. Population Initialization.
Encoding methods can be divided into three main categories: binary encoding, symbol encoding, and floating-point encoding. We adopt binary encoding, which uses a fixed-length string of binary symbols to represent an individual. If there are multiple parameters, the binary strings corresponding to each parameter are concatenated as an individual.
The binary string length of each parameter is determined according to the value interval and required precision. Suppose that the value interval of parameter $x_j$ is $[a_j, b_j]$ and the required precision is $k$ decimal places; then the binary string length $m_j$ is determined by

$$2^{m_j - 1} < (b_j - a_j) \times 10^k \le 2^{m_j} - 1.$$

Set the binary string as $U_1 U_2 U_3 \ldots U_{m_j}$, $U_i = 0$ or $1$; then the binary string is converted to the actual value of the corresponding parameter; i.e., the decoding formula is

$$x_j = a_j + \left(\sum_{i=1}^{m_j} U_i\, 2^{m_j - i}\right)\frac{b_j - a_j}{2^{m_j} - 1}.$$
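The encoding rule can be checked with a few lines of code. This is a minimal sketch, assuming the first bit $U_1$ is the most significant one; the function names `string_length` and `decode` are hypothetical.

```python
import math


def string_length(a, b, k):
    """Smallest m such that (b - a) * 10**k <= 2**m - 1."""
    return math.ceil(math.log2((b - a) * 10 ** k + 1))


def decode(bits, a, b):
    """Map a list of 0/1 bits (most significant first) to a value in [a, b]."""
    m = len(bits)
    integer = int("".join(str(u) for u in bits), 2)
    return a + integer * (b - a) / (2 ** m - 1)


if __name__ == "__main__":
    m = string_length(0.0, 1.0, 6)   # six decimal places on [0, 1]
    print(m, decode([1] * m, 0.0, 1.0))
```

For an interval of width 1 and six decimal places, the rule gives a 20-bit string, since $2^{19} < 10^6 \le 2^{20} - 1$. An individual carrying five parameters is then the concatenation of five such strings.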

Step 2. Calculate the Fitness Value.
Fitness is a function that evaluates each individual according to the optimization objective. In this paper, the optimization objective is the mean squared error between Heston prices and market prices,

$$\min_{\Omega}\ \frac{1}{n}\sum_{i=1}^{n}\left(C_i^H(\Omega) - C_i^M\right)^2,$$

where $\Omega = \{V_0, \kappa^*, \theta^*, \sigma, \rho\}$, $C_i^H(\Omega)$ and $C_i^M$ represent the Heston model price and the market price of the $i$th option, the market price of the option is the closing price, and $n$ is the sample size. The fitness $F_j$ of the $j$th individual is computed from this objective evaluated at $\Omega_j$, where $\Omega_j$ represents the $j$th group of parameters in the population.

Step 3. Select Operation.
The selection is made according to the fitness value. The higher the fitness value, the greater the probability of being inherited by the next generation. Selection methods include roulette selection, random competition selection, and optimal reservation selection. In this paper, we use the roulette selection method. A disk is divided into $M$ sectors, one per individual, with the sector of individual $j$ proportional to the probability $P_j = F_j / \sum_{k=1}^{M} F_k$, where $M$ is the population size. The disk is rotated randomly, and the individual whose sector lies under the pointer when the disk stops is selected.
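Roulette selection can be sketched in a few lines. The function name `roulette_select` is hypothetical, and the sketch assumes that all fitness values are nonnegative with a positive sum, which the probability $P_j$ above requires.

```python
import random


def roulette_select(fitness, rng):
    """Return index j with probability fitness[j] / sum(fitness)."""
    total = sum(fitness)
    pointer = rng.random() * total   # where the pointer lands on the disk
    cumulative = 0.0
    for j, f in enumerate(fitness):
        cumulative += f              # right edge of sector j
        if pointer < cumulative:
            return j
    return len(fitness) - 1          # guard against rounding at the edge


if __name__ == "__main__":
    rng = random.Random(0)
    picks = [roulette_select([1.0, 3.0], rng) for _ in range(10000)]
    print(sum(picks) / len(picks))   # share of picks that chose index 1
```

Calling the function $M$ times with the same fitness list yields the mating pool for the next generation; fitter individuals may appear several times.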

Step 4. Crossover Operation.
Crossover means that two paired chromosomes (individuals) exchange part of their genes with each other in a certain way; the main methods are single-point crossover, uniform crossover, multipoint crossover, and so on. Because Heston model has 5 parameters that require high precision, we choose the three-point crossover operation. Figure 4 is an operation diagram, where the cross points are randomly generated according to the crossover probability.
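A possible implementation of three-point crossover is sketched below. It is an assumption-laden illustration, since Figure 4 is not reproduced here: the crossover probability `p_c` is treated as the chance that a pair is crossed at all, the three cut points are drawn uniformly without replacement, and the segments between the first and second cut and after the third cut are swapped. The function name `three_point_crossover` is hypothetical.

```python
import random


def three_point_crossover(parent_a, parent_b, p_c, rng):
    """Swap alternating segments of two equal-length bit lists.

    Assumption: with probability p_c the pair is crossed at three distinct
    random cut points; otherwise both parents are copied unchanged.
    """
    if rng.random() >= p_c:
        return parent_a[:], parent_b[:]
    c1, c2, c3 = sorted(rng.sample(range(1, len(parent_a)), 3))
    child_a, child_b = parent_a[:], parent_b[:]
    # Exchange the segment between the first two cuts ...
    child_a[c1:c2], child_b[c1:c2] = parent_b[c1:c2], parent_a[c1:c2]
    # ... and the tail after the third cut
    child_a[c3:], child_b[c3:] = parent_b[c3:], parent_a[c3:]
    return child_a, child_b


if __name__ == "__main__":
    a, b = [0] * 10, [1] * 10
    print(three_point_crossover(a, b, 1.0, random.Random(0)))
```

Using several cut points lets genes from different parameters of the concatenated string be exchanged in a single operation, which is one plausible reading of why multipoint crossover suits a five-parameter individual.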

Step 5. Mutation Operation.
Mutation refers to a change in the value of one or more bits of an individual's coding string. The main methods include basic position mutation, uniform mutation, and boundary mutation. In this paper, we adopt basic position mutation: the gene sites to be mutated in each individual are determined randomly according to the mutation probability, and then each "1" at those sites is changed into "0" and each "0" into "1."

Step 6. Determine If Evolution Is Over.
In this paper, the judgment criterion is whether the preset number of generations has been reached. If not, go back to Step 2. If so, the algorithm terminates.
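The basic position mutation of Step 5 amounts to an independent bit flip at each gene site. The sketch below assumes that every bit is tested independently against the mutation probability `p_m`; the function name `bit_flip_mutation` is hypothetical.

```python
import random


def bit_flip_mutation(individual, p_m, rng):
    """Flip each bit of a 0/1 list independently with probability p_m."""
    return [1 - bit if rng.random() < p_m else bit for bit in individual]


if __name__ == "__main__":
    rng = random.Random(0)
    print(bit_flip_mutation([0, 1, 0, 1, 1, 0, 0, 1], 0.1, rng))
```

In a full run, selection, crossover, and mutation are applied in turn to the whole population, and the loop stops once the preset number of generations in Step 6 is reached.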

Hybrid Modelling Framework.
In summary, the model framework of hybrid CNN based on Heston model established in this paper is shown in Figure 5. The key steps of the modelling process are as follows.

Step 1. Parameter Estimation of Heston Model.
Taking the mean square error between Heston prices and market prices as the optimization objective, the genetic algorithm is used to estimate the five parameters $V_0$, $\kappa^*$, $\theta^*$, $\sigma$, $\rho$ of Heston model.

Step 2. Extract the Deviation Sequence.
Using the parameters obtained by genetic algorithm, the option price $C^H$ is computed by Heston model, and the deviation sequence $y$ is obtained by subtracting the model price $C^H$ from the market price $C^M$.

Step 3. Build the ResNet50 Model.
Taking $y$ as the expected output and the option sample information as the input data, the training sets and test sets are divided according to the obtained deviation sequence $y$ and the corresponding option sample information, and then ResNet50 can be trained.

Step 4. Correct Pricing Deviations.
The trained hybrid model is then used for data fitting or extrapolation analysis: the option pricing result of Heston model after parameter optimization is combined with the deviation predicted by ResNet50 to complete the correction of the pricing deviation, so the option pricing result of the hybrid model can be obtained as $C$.
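Steps 2 and 4 reduce to simple arithmetic once the Heston prices and the network predictions are available. The sketch below assumes the correction is additive, i.e., the hybrid price is the Heston price plus the predicted deviation, which follows from defining the deviation as market price minus Heston price. The function names and the sample numbers are hypothetical, not data from the paper.

```python
def deviation_sequence(market_prices, heston_prices):
    """Step 2: training target y = C_M - C_H for each option."""
    return [cm - ch for cm, ch in zip(market_prices, heston_prices)]


def hybrid_prices(heston_prices, predicted_deviations):
    """Step 4: hybrid price C = C_H + predicted deviation."""
    return [ch + y for ch, y in zip(heston_prices, predicted_deviations)]


if __name__ == "__main__":
    market = [0.250, 0.125, 0.500]   # made-up closing prices
    heston = [0.240, 0.130, 0.480]   # made-up Heston-GA prices
    y = deviation_sequence(market, heston)
    # A perfect predictor would reproduce the market prices exactly
    print(hybrid_prices(heston, y))
```

The Heston component therefore carries the interpretable part of the price, while the network only has to learn the remaining, typically much smaller, residual.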

Data Selection and Preprocessing.
This paper studies the call options of Huatai-PB CSI 300ETF, Harvest CSI 300ETF, and SSE 50ETF. The expiration months of the three options are the current month, the next month, and the following two quarterly months, and the expiration date is the fourth Wednesday of the expiration month. According to the above contract provisions and to reflect the integrity of data characteristics, the daily data from February 19 to December 31, 2021 are selected as model fitting samples and training samples in this paper. The daily data from January 4 to January 14, 2022 are selected as the prediction samples to conduct extrapolation analysis for each model. The daily data on February 19 and February 20, 2021 are selected as the parameter estimation data of Heston model. The above data are all from Wind database.
Data variables selected according to the model include strike price ($K$), remaining term ($T - t$, unit: year), closing price of the underlying asset ($S$), closing price of the option, risk-free interest rate ($r$), and historical volatility of the underlying asset price. In this paper, Shanghai Interbank Offered Rate (SHIBOR) is selected as the risk-free interest rate, with maturities of 3 months, 6 months, 9 months, and 1 year. The risk-free interest rate for the remaining maturities is calculated using linear interpolation of rates for adjacent maturities. The historical volatility of the underlying asset price is calculated from the previous 30 trading days and annualized on the basis of 250 trading days in a year. In the calculation formula, $P_i$ is the closing price of the underlying asset on the previous trading day, and $P_{i+1}$ is the closing price of the underlying asset on the current trading day.
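The two derived inputs, the interpolated risk-free rate and the historical volatility, can be sketched as follows. Several details are assumptions rather than statements of the paper's exact formula: daily returns are taken as log returns $\ln(P_{i+1}/P_i)$, the sample standard deviation uses an $n - 1$ denominator, and rates outside the quoted tenors are held flat. The function names and the sample rates are hypothetical.

```python
import math


def interpolate_rate(t, tenors, rates):
    """Linear interpolation between adjacent tenors (in years).

    Assumption: flat extrapolation below the first and above the last tenor.
    """
    if t <= tenors[0]:
        return rates[0]
    if t >= tenors[-1]:
        return rates[-1]
    for i in range(len(tenors) - 1):
        t0, t1 = tenors[i], tenors[i + 1]
        if t0 <= t <= t1:
            return rates[i] + (rates[i + 1] - rates[i]) * (t - t0) / (t1 - t0)


def historical_volatility(closes, window=30, trading_days=250):
    """Annualized standard deviation of the last `window` daily log returns."""
    if len(closes) < window + 1:
        raise ValueError("need at least window + 1 closing prices")
    recent = closes[-(window + 1):]
    rets = [math.log(p1 / p0) for p0, p1 in zip(recent, recent[1:])]
    mean = sum(rets) / window
    var = sum((x - mean) ** 2 for x in rets) / (window - 1)
    return math.sqrt(var * trading_days)


if __name__ == "__main__":
    tenors = [0.25, 0.5, 0.75, 1.0]          # 3M, 6M, 9M, 1Y
    rates = [0.020, 0.030, 0.035, 0.040]     # made-up SHIBOR quotes
    print(interpolate_rate(0.375, tenors, rates))
```

Both quantities are computed per trading day and per contract, so each option row in the sample carries its own remaining term, interpolated rate, and trailing volatility.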
Original data needs to be preprocessed, excluding nontrading day data, data of nonstandard contracts, data of option bid or offer price with null value and value less than 0.001, and data of underlying asset price with null value. e sample data table of the three ETF options is shown in Table 1.

Pricing Error Measure.
In this paper, the mean squared error (MSE), mean absolute percentage error (MAPE), and mean absolute error (MAE) are selected to evaluate the model performance by referring to the comparative measures of Zhang et al. [32] and Yang et al. [33]:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(C_i^M - C_i\right)^2, \qquad \mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{C_i^M - C_i}{C_i^M}\right|, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|C_i^M - C_i\right|,$$

where $C^M$ is the market price of the option, $C$ is the estimated price of the option model, and $n$ is the sample size.

Figure 4: Schematic diagram of three-point crossover operation.
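The three error measures are straightforward to compute. In the sketch below MAPE is returned as a fraction rather than a percentage, which is an assumption about the reporting convention; the function names are hypothetical.

```python
def mse(market, model):
    """Mean squared error between market and model prices."""
    return sum((cm - c) ** 2 for cm, c in zip(market, model)) / len(market)


def mape(market, model):
    """Mean absolute percentage error, returned as a fraction."""
    return sum(abs((cm - c) / cm) for cm, c in zip(market, model)) / len(market)


def mae(market, model):
    """Mean absolute error between market and model prices."""
    return sum(abs(cm - c) for cm, c in zip(market, model)) / len(market)


if __name__ == "__main__":
    market, model = [1.0, 2.0], [1.5, 1.0]
    print(mse(market, model), mape(market, model), mae(market, model))
```

MAPE weights errors on cheap, deep out-of-the-money options heavily, whereas MSE and MAE are dominated by expensive contracts, so reporting all three gives a more balanced picture.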

Parameter Estimation of Heston Model.
Considering the precision and efficiency of parameter estimation, the parameters of genetic algorithm are set as population size 20, parameter precision 6 (retaining six decimal places), selection probability $p_c = 0.6$, mutation probability $p_m = 0.1$, evolutionary generation 500, and parameter value ranges κ * : }. An initial population is randomly generated and converges to an optimal individual after 500 iterations. The parameter results of Heston model are shown in Table 2, which will also be used in hybrid modelling.

ResNet50 Model Adjustment.
When different types of options are applied to ResNet50, input data in line with the actual situation should be constructed, and the ZeroPadding layer and pooling layer should be adjusted accordingly. This paper empirically studies the call ETF options in China and makes the following adjustments to ResNet50 model. The adjusted structure is shown in Figure 6.
(1) Adjustment of input size: the input sample point consists of 9 options with the same expiration month at time t: 1 at-the-money option, 4 in-the-money options, and 4 out-of-the-money options stipulated in the contract. Each option includes the data of five variables: strike price K, remaining maturity time T − t, underlying asset price S at time t, risk-free interest rate r, and historical volatility of the underlying asset. Therefore, the size of the input data is 9 * 5 * 1.
(2) Add a ZeroPadding layer before the initial convolutional layer to make the data size 9 * 9 * 1.
(3) Due to the small data size, the pooling layer of the original model is removed.
(4) The output data is the expected pricing deviation of the Heston model optimized by the genetic algorithm for 9 options; i.e., the output is a 9-dimensional vector.
(5) The pricing deviation can be positive or negative, so the activation function of the fully connected layer is changed from softmax to tanh.
(6) ResNet50 model is used for a regression problem, so the loss function is changed to MSE.
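Adjustments (1) and (2) can be illustrated on a single sample. The sketch assumes that the zero columns are split evenly between the left and right sides of the 9 * 5 matrix, which the text does not specify; the function name `pad_columns` and the sample values are hypothetical.

```python
def pad_columns(sample, target_width=9):
    """Zero-pad each row of a 2-D sample to target_width columns.

    Assumption: padding is split as evenly as possible between both sides.
    """
    width = len(sample[0])
    left = (target_width - width) // 2
    right = target_width - width - left
    return [[0.0] * left + list(row) + [0.0] * right for row in sample]


if __name__ == "__main__":
    # One row per option: K, T - t, S, r, historical volatility (made-up values)
    row = [4.8, 0.25, 4.9, 0.025, 0.18]
    sample = [row[:] for _ in range(9)]       # 9 options, 5 variables each
    padded = pad_columns(sample)
    print(len(padded), len(padded[0]))        # 9 9
```

Because tanh outputs lie strictly between -1 and 1, adjustment (5) implicitly requires the pricing deviations, or a rescaled version of them, to fall within that range.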

Hyperparameter Setting of Neural Networks.
In this paper, we use BP neural network model as a benchmark for the performance of ResNet50 model in correcting the option price deviation of Heston model. BP model consists of three layers, and the number of neurons in the hidden layer is calculated according to the formula $\sqrt{m + n} + a$, where $m$ and $n$ are the numbers of neurons in the input layer and output layer, respectively, and $a$ is an integer in [0, 10]. By trial and error, comparing the error and training time, a hidden layer with 6 neurons is finally determined to be optimal. In order to improve the accuracy of the model and ensure the speed of training convergence, the batch size is 64, the number of training epochs is 500, and the ratio of training set to test set is 4 : 1. ResNet50 model differs from BP model in the selection of learning rate. The learning rate is tested and selected from 0.01, 0.001, and 0.0001. The results show that the loss functions of ResNet50 model and BP model change at reasonable rates for learning rates of 0.0001 and 0.01, respectively. Figure 7 shows the change in loss when ResNet50 model is trained on the option deviation data. The horizontal coordinate is the epoch and the vertical coordinate is the loss/val_loss value. In addition, due to the large number of ResNet50 model parameters, in order to increase the amount of training data, when constructing input samples as described previously, the row data of each sample is rearranged into 9 different sample points, and the output results are averaged. The rearrangement is shown in Figure 8, where each number represents a row of option data.

Table 3 lists the fitting results of six different models applied to data sets of three listed ETF options in China.
MSE, MAPE, and MAE of Heston-GA model are basically lower than those of BSM model, indicating that Heston model is closer to the actual market situation after relaxing the BSM model's assumption that volatility is constant. In the comparison of the three option fits, it can be concluded that the ResNet50 model does not outperform the other four parametric models, i.e., BSM, Heston, Heston-CIR, and nonaffine Heston. This supports the view that the ResNet50 model alone cannot explain the factors affecting option prices, leading to a large deviation in option pricing. The above conclusions confirm the necessity of hybrid modelling. Table 3 shows that the two hybrid models basically reduce all three measures relative to the five single models, with the ResNet50 hybrid model offering a modelling advantage over the BP hybrid model.
To validate the marketability of the model developed in this paper and to show that the trained ResNet50 model does not suffer from overfitting, an extrapolation analysis is performed on the model. We perform model pricing error analysis with data not involved in NNs training, i.e., out-of-sample predictive analysis. Table 4 lists the prediction errors of each model. The relative prediction error of each model is basically similar to the fitting error in Table 3, and all three measures of Heston-GA-ResNet50 model decrease for the three ETF options. In particular, the empirical evidence on 510300 ETF options suggests that if the hybrid model performs poorly in the fitting experiment, it will perform similarly in the prediction. Therefore, if the NNs models are to be applied to the real market, the training of the model determines the quality of the model. In general, an important observation can be derived from Tables 3 and 4: on the three options data sets, MSE, MAPE, and MAE of Heston-GA-ResNet50 are commonly smaller than those of the other six comparison models. This implies that correcting Heston price bias with ResNet50 gives more accurate results. As for the Heston optimization models, i.e., Heston-CIR and nonaffine Heston in this paper, they perform reasonably well for 510300 and 159919 ETF options in Table 4. But the combined information in Tables 3 and 4 demonstrates that their pricing performance is unstable, being limited by market factors and the accuracy of the parameter estimates. This validates the previous statement that parametric option pricing models lack, to a certain extent, the ability to capture nonlinear factors, i.e., robustness, compared to nonparametric models. To sum up, Heston-GA-ResNet50 model has the highest accuracy and robustness and better pricing performance, which has certain application significance in the option market. This point will also be demonstrated in the following empirical analysis.
To better illustrate the applicability of the Heston-GA-ResNet50 model, the option contracts for January 5, 2022, including the four maturity months of January, February, March, and June, are chosen as examples. The pricing curves of Heston-GA-ResNet50 model are essentially closer to the actual market price curves for both short and long remaining terms of the options, which further confirms the superiority of Heston-GA-ResNet50 model, consistent with the previous empirical findings.

Conclusions
In this paper, the parameters of Heston model are optimized through genetic algorithm. In particular, we propose a new way of constructing input data by combining data from several option contracts according to some criteria. This image-like data is suitable as input data for CNN, as it is in this paper for the adjusted ResNet50 model. Our approach provides a reference for the application of CNN models or existing classical CNN models to options pricing, which has theoretical implications. The empirical study on three kinds of listed ETF call options then shows that our hybrid model is superior to other models in terms of both fitting and prediction. For the comparison, we not only use BSM model and Heston model, but also add BP hybrid model and two Heston optimization models, namely, Heston-CIR model and nonaffine Heston model. The better performance of our model compared to these models confirms the validity of the modelling ideas in our research. Further analysis of the applicable scope of model pricing shows that Heston-GA-ResNet50 model has a high accuracy for both short and long maturities. And, compared with other models, Heston-GA-ResNet50 model has a stronger applicability for long maturities. Therefore, the modelling process of our hybrid model is effective: the parametric pricing model ensures the interpretability of the hybrid model and the nonparametric model enhances its nonlinear fitting ability, making our Heston-GA-ResNet50 model more accurate and robust. In summary, the model proposed in this paper is of both theoretical and practical significance and will provide a reference for academics and investors.
In future research, we will extend our model to the pricing of other kinds of options. We will also consider how to establish a hybrid model with low complexity, fast training speed, high accuracy, and strong applicability.