A Static-Dynamic Hypergraph Neural Network Framework Based on Residual Learning for Stock Recommendation



Introduction
Stock investment has evolved into a significant avenue for generating personal and institutional profits. As of the first quarter of 2022, the total market capitalization of major global stock markets had surpassed an astounding $105 trillion. Predicting stock market trends accurately is a formidable task due to the nonstationary nature and high volatility of the market. While the feasibility of precise prediction remains controversial, decades of ongoing research in stock market prediction have yielded impressive results, suggesting that, to a certain extent, the stock market is predictable.
Stock data are typically in time series format and can be analyzed by various methods such as autoregressive models, recurrent neural networks (RNNs) [1, 2], and transformers [3]. These methods have been applied successfully to trend prediction in the financial domain. However, many previous studies that rely on neural network methods treat the prediction problem as an isolated financial time series, overlooking the intricate relationships between stocks. Recent research has demonstrated that the spatial dimension underlying time series data in financial markets, which represents the relational information among stocks, significantly impacts the accuracy of predictive models [4-6]. There are two main types of relationships among stocks: static relationships that remain stable over extended periods and dynamic relationships that continually evolve over short time frames. For instance, the relationship between an upstream company and a downstream company in a supply chain is static. In contrast, dynamic relationships may emerge due to unforeseen events, such as emergencies, and disappear as the situation resolves. Graph neural networks (GNNs) are the models most commonly employed to capture these relationships [7, 8]. A GNN uses an adjacency matrix to describe the correlations between pairs of stocks: the nodes and edges of the graph represent stocks and their relationships, respectively, and the influence between each pair of stocks is captured through node representation learning. However, in real stock markets, stock price movements can also be influenced by synergistic factors, such as industry-specific policies or common suppliers among companies. These synergistic relationships naturally group stocks together, extending beyond individual pairs, and the pairwise graphs used by GNNs may not be sufficient to describe such complex relationships.
As a natural extension of GNNs, hypergraph neural networks (HGNNs) introduce an incidence matrix to construct a relational representation of stock groups [9, 10]. The incidence matrix is indexed by hyperedges for its columns and by stocks (nodes) for its rows. Within a hypergraph, a stock (node) can be affiliated with multiple hyperedges, signifying that the stock possesses multiple attributes. Similarly, a hyperedge can encompass multiple stocks (nodes), indicating that the stocks sharing the same hyperedge have common properties. In contrast to GNNs, HGNNs excel at capturing a broader spectrum of synergy information. However, due to the complexity of the interactions among stocks, two major challenges remain in this application. (1) The construction of the adjacency matrix for the graph and the incidence matrix for the hypergraph relies on preexisting information, which can lead to the omission of valuable relationships. Consequently, this approach may not be well suited for capturing dynamic relationships among stocks. As shown in Figure 1, the stock 300750.SZ is the leading lithium battery manufacturer, and the other two stocks, 300343.SZ and 603993.SH, are two upstream companies in its supply chain. The price movements of 300750.SZ and 603993.SH exhibited a high degree of similarity during the period before the marked red box, which is attributed to the latter serving as a raw material supplier for the former. After the marked red box, all three companies' stock prices displayed similar fluctuations because 300343.SZ had become a significant component of the supply chain of 300750.SZ. If prior information is provided manually, there is a risk that 300343.SZ might not be considered in the stock prediction for 300750.SZ. This highlights the limitations of relying solely on predetermined relationships and the importance of adapting to evolving dynamics in stock markets. (2) Previous methods have treated GNNs and HGNNs separately. The former excels at capturing pairwise relationships, while the latter shines at capturing synergistic relationships; however, these two significant advantages have not yet been combined in one model.
To tackle the aforementioned challenges, we propose a static-dynamic hypergraph neural network framework based on residual learning for stock recommendation, abbreviated as SD-RL. The overall SD-RL framework is illustrated in Figure 2. Specifically, we decompose financial time series data along two dimensions: time and space. In the time dimension, we feed the historical data of each stock into a GRU network and employ an attention mechanism to discern the varying significance of different trading days; this network captures sequential dependencies and produces sequential embeddings of the stocks. In the space dimension, we incorporate residual learning to uncover latent relationship information [11]. We adopt a data-driven approach to learn both the inherent static relationships between stocks and the time-evolving dynamic relationships through the graph learning module and the hypergraph learning module, respectively. Ultimately, the prediction module amalgamates the latent relationship information streams derived from the multiple modules (static graph and dynamic hypergraph modules) to forecast stock trends and pinpoint stocks with promising potential. Extensive experiments demonstrate the effectiveness and rationality of our approach.
The remainder of this paper is organized as follows. In Section 2, we review the related work and highlight the unresolved problems. Section 3 presents the technical details of the proposed framework. Experimental setups and results are described in Section 4. Section 5 concludes this study and outlines future research directions.

Related Work
This section provides a review of traditional stock forecasting methods and the existing literature on graph neural networks. Hypergraph learning is also briefly discussed.
In the field of academic research on stock forecasting, two main types of methods have been explored: statistical methods and machine learning methods. Statistical models, such as the autoregressive moving average model, the autoregressive integrated moving average model, and their variants, have been used to forecast stock prices. However, these early works relied on handcrafted features, which frequently led to predictions lagging behind the actual price movements. In recent years, machine learning algorithms such as logistic regression (LR) and support vector machines (SVMs) have shown promising advancements in stock forecasting.

Traditional stock forecasting methods frequently fall short in leveraging the intricate interplay among stocks. Chen et al. [17] innovatively established networks of interconnected companies based on real-market investment events and then enhanced the relationships between stocks through graph convolutional neural networks, resulting in more precise predictions. Building on this, Feng et al. [4] framed stock prediction as a ranking task and proficiently addressed the interplay among distinct stocks by aggregating company metadata and encoding time-sensitive stock relationships. Similarly, Xu et al. [18] harnessed graph-based techniques to model pairwise relations among stocks by utilizing sector-industry metadata and key business data of companies. Additionally, Cheng et al. [19] meticulously crafted a heterogeneous graph by amalgamating events, news, relationships, and market data within a knowledge graph; this multimodal input fusion significantly bolstered the financial prediction capabilities of their model. However, one limitation of graph-based methods is their reliance on prior knowledge for constructing static relationships. To address this constraint, various endeavors have been made to uncover hidden relationships in graphs [20, 21].
In recent times, hypergraph learning has witnessed substantial advancements in addressing problems that involve relationships among data extending beyond pairwise interactions. Applications span diverse domains, including visual object recognition [22], traffic prediction [23], recommender systems [24], and social networks [25]. In the context of stock recommendation tasks, hypergraphs are used to categorize stocks into multiple groups and refine their representations. Sawhney et al. [5] introduced a spatiotemporal hypergraph convolutional network (STHGCN) based on predefined industrial relationships; this method combined gated temporal convolution with hypergraph convolution in the spectral domain, enabling the evolution of stock prices and relationships to be captured in a spatiotemporal-aware manner. To further enhance the propagation of hypergraph information, Cui et al. [26] put forth a hypergraph triple attention network for stock trend prediction built on the foundation of STHGCN; this method augmented stock trend prediction by explicitly modeling group industry affiliations among stocks through a hypergraph attention module. Moreover, Li et al. [27] presented a reinforcement learning approach grounded in hypergraph-based methods for stock portfolio selection, which enhanced investment selection by incorporating a hypergraph attention module to represent industry affiliations among stocks effectively.
Based on the research mentioned above, it is evident that the majority of existing graph-based or hypergraph-based deep learning methods require the incorporation of domain-specific prior knowledge, and the construction of these models relies heavily on resource-intensive strategies for acquiring relations. Additionally, a limitation of existing hypergraph neural networks is their lack of specialization for temporal learning involving time-evolving features, such as daily stock prices. Hence, in this paper, we adopt a data-driven approach to learn the inherent static connections among stocks and design an efficient hypergraph construction algorithm to capture the dynamic relationships that evolve over time. As a result, we achieve the recommendation task by integrating information flows from both static and dynamic time series sources.

Framework
In this section, we present a static-dynamic hypergraph neural network framework based on residual learning for end-to-end stock recommendation. As shown in Figure 2, the framework comprises three parts: feature extraction, residual learning, and ranking and optimization.

Temporal Feature Extraction.
Historical stock price data have demonstrated their efficacy in forecasting future stock price trends [28]. To capture the time series evolution characteristics of individual stocks, we employ the Attentive GRU model. The Attentive GRU comprises two essential components: the GRU layer and the temporal attention mechanism. In contrast to conventional RNN models, this configuration excels at capturing long-range dependencies while maintaining a concise structure.

GRU Layer.
For each stock (take stock s_i as an example), the GRU layer is used to map the historical feature sequence X_i ∈ R^{L×F} to hidden representations h_i ∈ R^{L×F_h}, where F_h is the dimension of the hidden representations.
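To make the mapping concrete, the following is a minimal PyTorch sketch of such a GRU layer. The tensor shapes follow the notation above (N stocks, L trading days, F raw features, F_h hidden dimensions); the class name and the concrete sizes are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch, assuming a single-layer GRU applied per stock sequence.
import torch
import torch.nn as nn

class StockGRU(nn.Module):
    def __init__(self, num_features: int, hidden_size: int):
        super().__init__()
        # batch_first=True so the input is (num_stocks, L, num_features)
        self.gru = nn.GRU(num_features, hidden_size, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, L, F) -> hidden states h: (N, L, F_h)
        h, _ = self.gru(x)
        return h

# Example: 300 stocks, 60 trading days, 6 raw features, 64-dim hidden state
h = StockGRU(6, 64)(torch.randn(300, 60, 6))  # -> shape (300, 60, 64)
```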

Temporal Attention Layer.
The attention mechanism is widely employed in sequence learning problems [3, 14]. Over the last L trading days, the hidden representations at different time steps have varying degrees of influence on the overall hidden representation of the sequence. As a result, we utilize the attention mechanism to combine the hidden representations at different time steps in the following manner:

a_τ = exp((W_q h_L)^T (W_k h_τ)) / Σ_j exp((W_q h_L)^T (W_k h_j)),   h̄ = Σ_τ a_τ h_τ,

where W_k and W_q ∈ R^{F_h×F_h} are parameters to be learned. Finally, the overall hidden representations of the N stocks are collected into a matrix h̄_t ∈ R^{N×F_h}, which represents the current state of each company under the movement of its stock price, and we set X_{t,h} = h̄_t [29].
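As a rough illustration, the sketch below implements a temporal attention layer under the assumption (made here for the reconstruction, not stated explicitly in the text) that the query is derived from the last trading day's hidden state and the keys from all time steps.

```python
# Hedged sketch of the temporal attention layer; the exact scoring function
# used by the authors may differ from this query-key formulation.
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.W_q = nn.Linear(hidden_size, hidden_size, bias=False)
        self.W_k = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (N, L, F_h); query taken from the last trading day
        q = self.W_q(h[:, -1, :]).unsqueeze(1)           # (N, 1, F_h)
        k = self.W_k(h)                                  # (N, L, F_h)
        scores = torch.softmax((q * k).sum(-1), dim=-1)  # (N, L) attention weights
        return (scores.unsqueeze(-1) * h).sum(dim=1)     # (N, F_h), i.e., X_{t,h}
```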
Static Graph Learning.
Motivated by this, we aim to leverage graph learning methods to acquire static stock graphs. This process can be described as follows:

M_1 = tanh(α E_1 θ_1),   M_2 = tanh(α E_2 θ_2),
A = LeakyReLU(tanh(α (M_1 M_2^T − M_2 M_1^T))),

where A is the adjacency matrix of the static graph; E_i ∈ R^{N×F_e} (i = 1, 2) are the randomly initialized stock embeddings; θ_i ∈ R^{F_e×F_e} are learnable parameters; tanh and LeakyReLU are activation functions; and α is a hyperparameter controlling the saturation rate of the activation functions.
Because companies with different market capitalizations have different impacts on the market, static stock relations should be asymmetric. We retain only the positive entries among the learned static relations to obtain the adjacency matrix A+.
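The snippet below sketches one plausible realization of this data-driven static graph learning step, assuming an MTGNN-style construction consistent with the reconstructed equations above; the class and parameter names are ours, not the authors'.

```python
# Illustrative sketch: learn an asymmetric stock adjacency matrix and keep its
# positive part (A+). E1/E2 are the randomly initialized stock embeddings and
# alpha is the saturation-rate hyperparameter from the text.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StaticGraphLearner(nn.Module):
    def __init__(self, num_stocks: int, emb_size: int, alpha: float = 1.0):
        super().__init__()
        self.E1 = nn.Parameter(torch.randn(num_stocks, emb_size))
        self.E2 = nn.Parameter(torch.randn(num_stocks, emb_size))
        self.theta1 = nn.Linear(emb_size, emb_size, bias=False)
        self.theta2 = nn.Linear(emb_size, emb_size, bias=False)
        self.alpha = alpha

    def forward(self) -> torch.Tensor:
        m1 = torch.tanh(self.alpha * self.theta1(self.E1))
        m2 = torch.tanh(self.alpha * self.theta2(self.E2))
        a = F.leaky_relu(torch.tanh(self.alpha * (m1 @ m2.T - m2 @ m1.T)))
        return torch.relu(a)  # A+: keep only the positive (asymmetric) relations
```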

Static Graph Convolution.
To ensure that the stock nodes preserve their original information during information propagation, we retain a portion of the original hidden state of the stock time series, which is illustrated as follows:

X^{l+1}_{t,s} = λ X^{l}_{t,s} + (1 − λ) Â+ X^{l}_{t,s},

where X^{l}_{t,s} ∈ R^{N×F_h} represents the input of the l-th static graph convolution layer at time step t; X^{0}_{t,s} = X_{t,h}; Â+ is the normalized positive adjacency matrix; and λ controls the retained proportion of the original hidden state. We feed the output of the static graph convolution X^{l+1}_{t,s} into two fully connected layers with LeakyReLU activation functions σ(·) to generate the backcast X^{b}_{t,s} and the forecast Y^{f}_{t,s}:

X^{b}_{t,s} = σ(X^{l+1}_{t,s} W^{b}_{s}),   Y^{f}_{t,s} = σ(X^{l+1}_{t,s} W^{f}_{s}),

where W^{b}_{s} and W^{f}_{s} are parameters to be learned.

Each hyperedge in the hyperedge set is assigned a positive weight. All weights are stored in the diagonal matrix W ∈ R^{|E|×|E|}, whose initial entries are one, meaning that each hyperedge is treated equally. It has been proved in [30] that a hypergraph degenerates into an ordinary graph if and only if each hyperedge is associated with exactly two vertices.
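Returning to the static graph convolution described earlier in this subsection, the following is a minimal sketch of one layer with the retention coefficient λ and the two output heads; the row normalization of A+ and the layer sizes are illustrative assumptions.

```python
# Sketch of one static graph convolution layer with residual retention and the
# backcast/forecast heads; not the authors' exact implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StaticGraphConv(nn.Module):
    def __init__(self, hidden_size: int, lam: float = 0.3):
        super().__init__()
        self.lam = lam
        self.backcast = nn.Linear(hidden_size, hidden_size)
        self.forecast = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor, a_pos: torch.Tensor):
        # x: (N, F_h) node features; a_pos: (N, N) learned positive adjacency A+
        a_norm = a_pos / a_pos.sum(dim=1, keepdim=True).clamp(min=1e-6)
        x_next = self.lam * x + (1.0 - self.lam) * a_norm @ x   # propagation with retention
        x_back = F.leaky_relu(self.backcast(x_next))            # backcast X^b_{t,s}
        y_fore = F.leaky_relu(self.forecast(x_next))            # forecast Y^f_{t,s}
        return x_next, x_back, y_fore
```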

Incidence Matrix.
The incidence matrix in hypergraph theory reveals the relationship between hypergraph vertices and hyperedges. For an undirected hypergraph with no isolated points, the incidence matrix H ∈ R^{|V|×|E|} is defined as

H(v, e) = 1 if v ∈ e, and H(v, e) = 0 otherwise.

For a vertex v in a hypergraph, its degree is defined as the sum of the weights of all hyperedges associated with the vertex:

d(v) = Σ_{e∈E} w(e) H(v, e).

Similarly, the degree of a hyperedge e is defined as

δ(e) = Σ_{v∈V} H(v, e),

where the vertex degrees and hyperedge degrees are collected in the diagonal matrices D ∈ R^{|V|×|V|} and B ∈ R^{|E|×|E|}, respectively.
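The short sketch below computes these quantities for a toy hypergraph, assuming unit hyperedge weights as in the initialization described above.

```python
# Hypergraph bookkeeping: incidence matrix H, hyperedge weights w,
# vertex-degree matrix D, and hyperedge-degree matrix B.
import torch

def hypergraph_degrees(H: torch.Tensor, w: torch.Tensor):
    # H: (|V|, |E|) binary incidence matrix, w: (|E|,) hyperedge weights
    D = torch.diag(H @ w)          # vertex degrees d(v) = sum_e w(e) H(v, e)
    B = torch.diag(H.sum(dim=0))   # hyperedge degrees delta(e) = sum_v H(v, e)
    return D, B

H = torch.tensor([[1., 0.], [1., 1.], [0., 1.]])  # 3 vertices, 2 hyperedges
D, B = hypergraph_degrees(H, torch.ones(2))        # unit weights: all hyperedges equal
```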

Dynamic Hypergraph Construction.
Dynamic hypergraphs are devised to unveil additional information concealed within static graphs. Notably, there are abundant signals in the temporal price movements of related stocks. Drawing inspiration from [18], we employ a residual architecture to extract the information flow of each stock after filtering out the static relationships. As shown in Figure 3 (left frame), we initialize the hypergraph structure with the input feature embedding X_{t,h} − X^{b}_{t,s}. Dynamic hypergraphs are created through the following steps. We start with N hyperedges, where each hyperedge e is initialized with the feature embedding of its corresponding node v. Subsequently, each node identifies the hyperedge closest to itself, excluding the hyperedge initialized with its own feature embedding. Following that, each node is incorporated into the closest hyperedge, potentially leading to the removal of hyperedges that did not receive any nodes. For those hyperedges that persist after this process, the nodes used in their initialization are also added to them. Lastly, it is worth noting that among these N nodes, some join a single hyperedge while others join two hyperedges (those whose initialization hyperedges have not been deleted). For nodes joining two hyperedges, we employ the k-nearest-neighbor algorithm to find the k closest nodes to form a new hyperedge. The hyperedge set is dynamically adjusted as the feature embeddings evolve with the network going deeper. In our proposed method, the incidence matrix H_dynamic is obtained through this dynamic hypergraph construction algorithm. The entire process of dynamic hypergraph construction is shown in Algorithm 1.
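A simplified sketch of this construction is given below. It follows the steps just described (every node joins its nearest foreign hyperedge, empty hyperedges are dropped, surviving hyperedges re-admit their initializing node, and nodes belonging to two hyperedges spawn an extra k-NN hyperedge); distance computation, tie-breaking, and duplicate handling are simplifying assumptions, not the authors' Algorithm 1 verbatim.

```python
# Hedged sketch of dynamic hypergraph construction from node embeddings x.
import torch

def build_dynamic_hypergraph(x: torch.Tensor, k: int = 8) -> torch.Tensor:
    n = x.size(0)
    dist = torch.cdist(x, x)                      # pairwise embedding distances
    dist.fill_diagonal_(float("inf"))             # exclude a node's own hyperedge
    nearest = dist.argmin(dim=1)                  # nearest foreign hyperedge per node
    H = torch.zeros(n, n)
    H[torch.arange(n), nearest] = 1.0             # each node joins its closest hyperedge
    kept = H.sum(dim=0) > 0                       # hyperedges that received a node survive
    H[torch.arange(n), torch.arange(n)] = kept.float()  # re-add initializing nodes of kept hyperedges
    extra = []
    for v in torch.nonzero(H.sum(dim=1) > 1).flatten():
        # node v belongs to two hyperedges: form a new hyperedge from its k nearest nodes
        e = torch.zeros(n)
        e[dist[v].topk(k, largest=False).indices] = 1.0
        e[v] = 1.0
        extra.append(e)
    H = H[:, kept]                                # drop empty hyperedges
    if extra:
        H = torch.cat([H, torch.stack(extra, dim=1)], dim=1)
    return H                                      # incidence matrix H_dynamic, shape (|V|, |E|)
```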

Hypergraph Convolution.
Hypergraph convolution is rooted in spatial-domain graph theory, which conceptualizes hypergraph learning as a process of information exchange among interconnected nodes with neighbor relationships [10]. The input of the l-th hypergraph convolution layer is the feature embedding X^{l}_{t,d}, with X^{0}_{t,d} = X_{t,h} − X^{b}_{t,s}. As shown in Figure 3 (right frame), the hypergraph spatial-domain convolution updates the feature X^{l}_{t,d} to a new feature X^{l+1}_{t,d}. This closed-loop message-passing mechanism involves two stages of directed message flow propagation, which can be described in matrix form as follows.
The first stage aggregates node features into hyperedge features:

Z^{l}_{t,d} = B^{−1} H^T X^{l}_{t,d} Θ.

The second stage propagates the hyperedge features back to the nodes:

X^{l+1}_{t,d} = LeakyReLU(D^{−1} H W Z^{l}_{t,d}),

where Θ is a learnable parameter matrix; LeakyReLU is a rectified linear unit activation function; and D, B, H, and W have been defined in Sections 3.4.1 and 3.4.2. We set W = I and H = H_dynamic. Similar to the static graph module, the dynamic hypergraph module also has two output branches, i.e., the backcast X^{b}_{t,d} and the forecast Y^{f}_{t,d}:

X^{b}_{t,d} = σ(X^{l+1}_{t,d} W^{b}_{d}),   Y^{f}_{t,d} = σ(X^{l+1}_{t,d} W^{f}_{d}),

where W^{b}_{d} and W^{f}_{d} are learnable parameter matrices. To mitigate the impact of both static and dynamic relationships, we introduce an independent module for capturing individual stock time series information:

Y^{f}_{t,i} = σ((X_{t,h} − X^{b}_{t,s} − X^{b}_{t,d}) W^{f}_{i}),

where W^{f}_{i} is a learnable parameter matrix.
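A compact sketch of the two-stage hypergraph convolution (with W = I) is shown below; it mirrors the reconstructed equations above and is only an illustrative implementation.

```python
# Two-stage spatial hypergraph convolution: node -> hyperedge, then hyperedge -> node.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HypergraphConv(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.theta = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor, H: torch.Tensor) -> torch.Tensor:
        # x: (N, F_h) node features; H: (N, |E|) incidence matrix (W = I assumed)
        B_inv = torch.diag(1.0 / H.sum(dim=0).clamp(min=1e-6))  # inverse hyperedge degrees
        D_inv = torch.diag(1.0 / H.sum(dim=1).clamp(min=1e-6))  # inverse vertex degrees
        edge_feat = B_inv @ H.T @ self.theta(x)                 # stage 1: node -> hyperedge
        return F.leaky_relu(D_inv @ H @ edge_feat)              # stage 2: hyperedge -> node
```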
Ranking and Optimization.
A combination of a pointwise regression loss and a pairwise rank-aware loss is used to optimize SD-RL:

L = Σ_i (ŷ^{i}_{t+1} − y^{i}_{t+1})^2 + β Σ_i Σ_j max(0, −(ŷ^{i}_{t+1} − ŷ^{j}_{t+1})(y^{i}_{t+1} − y^{j}_{t+1})),

where y_{t+1} is the true value, ŷ_{t+1} is the predicted ranking score, and β is a hyperparameter that balances the two loss terms. The former term minimizes the difference between the predicted and true scores, while the latter encourages the predicted ranking of each stock pair to preserve the same relative order as the ground truth.
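For illustration, the combined objective can be written as in the sketch below, assuming a mean-squared pointwise term and a hinge-style pairwise term as in [4].

```python
# Combined pointwise regression + pairwise rank-aware loss (illustrative).
import torch

def sd_rl_loss(y_pred: torch.Tensor, y_true: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    # y_pred, y_true: (N,) predicted and true next-day ranking scores
    pointwise = ((y_pred - y_true) ** 2).mean()
    diff_pred = y_pred.unsqueeze(0) - y_pred.unsqueeze(1)   # pairwise prediction gaps
    diff_true = y_true.unsqueeze(0) - y_true.unsqueeze(1)   # pairwise ground-truth gaps
    pairwise = torch.relu(-diff_pred * diff_true).mean()    # penalize pairs in the wrong order
    return pointwise + beta * pairwise
```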

Experiments
In this section, the details of the datasets, training settings, and experimental evaluation metrics are provided. Subsequently, a series of experiments is conducted, and the results verify the effectiveness of the proposed SD-RL method.

Dataset and Training Settings
CSI 300 and CSI 100 [18]. The CSI 300 consists of the 300 most representative stocks among Shanghai and Shenzhen A-shares, and the CSI 100 is made up of the 100 largest stocks in the CSI 300 set. We utilized the most recent 60-day raw data of these stocks, including opening prices, highest prices, lowest prices, closing prices, trading volumes, and volume-weighted average prices. The stock data from the CSI 300 and CSI 100 were collected from 01/01/2007 to 12/31/2020. We then divided this dataset into training, validation, and test sets in chronological order. Our model was trained, and its parameters were fine-tuned, until it achieved the best performance on the validation set. We conducted each experiment five times and computed the average model performance on the test set to ensure robust results.

NASDAQ and NYSE [4]. Considering the inherent volatility of financial markets, we conducted additional experiments to assess the robustness of our model across different market conditions. For this purpose, we utilized two publicly available stock datasets created in [4], where the authors compiled price records for 1026 NASDAQ and 1737 NYSE stocks spanning from 01/02/2013 to 12/08/2017. For both datasets, only stock industry-belonging relationship information is provided.

Training Settings.
For our proposed framework, we employed the Adam optimizer to fine-tune parameters and set the initial learning rate to 0.0002. The specific parameter ranges used in our experiments were as follows: the hidden state size of the GRU within (32, 64, 128), the embedding size of the static graph module within (32, 64, 128), the saturation rate α from 0.5 to 3, the parameter λ from 0 to 0.6, the parameter k of the k-nearest-neighbor algorithm from 2 to 16, and β from 0.1 to 2. The models were implemented using the PyTorch framework, and we conducted a grid search to determine the optimal hyperparameters.
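For reference, the search space described above can be expressed as a simple grid; the concrete values below are sampled from the stated ranges and are illustrative only, not the exact grid used by the authors.

```python
# Hypothetical hyperparameter grid for the grid search described in the text.
param_grid = {
    "gru_hidden_size": [32, 64, 128],
    "static_embedding_size": [32, 64, 128],
    "alpha": [0.5, 1.0, 2.0, 3.0],     # saturation rate of the activation
    "lambda": [0.0, 0.2, 0.4, 0.6],    # retention ratio in graph convolution
    "k": [2, 4, 8, 16],                # k-nearest-neighbor size
    "beta": [0.1, 0.5, 1.0, 2.0],      # balance between the two loss terms
    "learning_rate": [2e-4],           # initial Adam learning rate
}
```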

Evaluation.
We assessed the effectiveness of our approach in terms of ranking performance. Building upon prior research [18], we evaluated the prediction results using two commonly used metrics, the information coefficient (IC) and the rank information coefficient (Rank IC):

IC = corr(ŷ_t, y_t),   Rank IC = corr(rank(ŷ_t), rank(y_t)),

where corr(·) is the Pearson correlation coefficient and rank(y_t) and rank(ŷ_t) are the true and predicted rankings from high to low, respectively. In addition, we use a further metric, Precision@N, to evaluate the accuracy of the top N predictions of the model. For example, if N equals 10 and 5 labels out of the top 10 predictions are positive, then Precision@10 equals 50%. To compare with existing research, we set N to 3, 5, 10, and 30 when evaluating the models.
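The sketch below shows one way to compute these metrics for a single trading day with NumPy/SciPy; in practice, IC and Rank IC are averaged over all test days.

```python
# Illustrative per-day computation of IC, Rank IC, and Precision@N.
import numpy as np
from scipy.stats import pearsonr, rankdata

def ic(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return pearsonr(y_pred, y_true)[0]

def rank_ic(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return pearsonr(rankdata(-y_pred), rankdata(-y_true))[0]  # rankings from high to low

def precision_at_n(y_true: np.ndarray, y_pred: np.ndarray, n: int = 10) -> float:
    top = np.argsort(-y_pred)[:n]             # indices of the N highest predictions
    return float((y_true[top] > 0).mean())    # fraction of them with a positive label
```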

Experimental Results and Analysis.
To validate the superior ranking performance of SD-RL, we conducted a comparative analysis against several existing baseline models, which include the following. (i) SFM [15]: a variant of the LSTM network that decomposes the hidden state of the LSTM storage unit into multiple frequency components, effectively memorizing time series information of different frequencies. (ii) ALSTM [14]: a variant of the LSTM network with better generalization ability, which uses adversarial training to simulate the randomness in the model training process. (iii) ALSTM + TRA [16]: an extension of ALSTM that leverages a temporal routing adapter (TRA) to learn multiple trading patterns in stock market data. (iv) A transformer-based model [3]: a stock trend prediction model based on the transformer architecture, which integrates a multiscale Gaussian prior with a self-attention mechanism to model temporal context information. (v) GATs [8]: a variant of graph convolutional networks that aggregates time series feature embeddings extracted by a GRU network using an attention mechanism. (vi) HIST [18]: a graph-based neural network that extracts concept-oriented shared information to forecast stock trends.

Table 1 presents a summary of the ranking performance achieved by the various methods across different values of N. Notably, ALSTM + TRA outperforms the other models that do not integrate relational information in terms of both the IC and Rank IC metrics; hence, when assessing these metrics against nonrelational models, we compare our results only with ALSTM + TRA. Furthermore, within the domain of graph-based models, HIST shows superior performance compared to GATs and the baseline models that do not integrate relational information. It is worth emphasizing that, while HIST relies on a predefined graph structure, our approach SD-RL excels by proficiently capturing both static and dynamic relationships through a data-driven methodology, which results in significantly improved performance across multiple metrics. To be more precise, on the CSI 100 dataset, SD-RL outperforms the second-place model by an average margin of nearly 3.3% and 2.5% in terms of IC and Rank IC, respectively. Furthermore, we continued our experiments on the NASDAQ and NYSE stock datasets; Table 2 shows the ranking performance of the different methods. It is noteworthy that the overall performance of all models on the A-share market surpasses that on the US stock market. This difference could be attributed to the relatively shorter data period covered by the US stock datasets and the larger number of stocks they comprise. Our model achieves promising prediction results, consistently ranking in the top two positions on most evaluation metrics. It can be seen that the performance of graph-based models is significantly improved by leveraging relationship information. In addition, we also find that HIST achieves significantly better results on the NYSE dataset than on NASDAQ, which could be attributed to the fact that HIST defines static relationships at the outset (leveraging prior information on industry relationships); industry relationships reflect longer-term correlations between stocks, and NASDAQ is more susceptible to short-term factors. By contrast, the variation of our model across these two datasets is relatively small, which can be attributed to the fact that our model learns the static and dynamic (long- and short-term) relationships between stocks over time in a data-driven manner.

Model Component Ablation Study.
To investigate the contributions of the different components of SD-RL, three variants of SD-RL were designed, namely, "GRU + Attn," "GRU + Attn + Sta," and "GRU + Attn + Sta + Dy." Table 3 compares the performance of SD-RL and these variants. It is evident that SD-RL without any relational information (GRU + Attn) yields the least favorable results, underscoring the value of considering interrelations among stocks. Additionally, we observe that the model integrating both static and dynamic relationship information (GRU + Attn + Sta + Dy) outperforms the one that focuses solely on static relationships (GRU + Attn + Sta). This affirms the significance of both static and dynamic relationships, highlighting that neither should be disregarded. Notably, the full SD-RL model (GRU + Attn + Sta + Dy + Ind) attains the most favorable outcomes, providing further validation that amalgamating the output information from the distinct modules enhances the capacity of the feature embeddings to capture trends.

Visualization of Embeddings.
To further validate the efficacy of our proposed model, we employ t-distributed stochastic neighbor embedding (t-SNE) [31] to project two types of stock embeddings into a 2D space: one originating from the GRU and the other from our method. Figure 4 shows these two types of embeddings for the 100 stocks in the CSI 100 dataset. Stocks belonging to the banking and brewing industries are highlighted with bright colors. Stock embeddings within the same industry should be as similar as possible. To visualize this, we circle the labeled stocks (depicted as dots) belonging to the same industry, with the circle's size serving as an indicator of the clustering degree of the stock embeddings; smaller circles indicate a higher level of aggregation. The figure shows that, compared with the GRU embeddings, the stock embeddings within the same industry in SD-RL are significantly more clustered (the circles are smaller), and the embeddings of stocks in different industries are more dispersed. These results show that our method is more effective at capturing and preserving the correlation among stocks in the same industry, and the learned stock embeddings are more discriminative. In addition, we also notice that BYD (a new energy vehicle company) is an isolated point in the GRU embedding, while the embedding of BYD is projected relatively close to the embedding of Huaneng Power (an energy company) in SD-RL. This also confirms the ability of our proposed method to capture hidden relationships between different companies.
If "k" is excessively large or small, it diminishes the capacity to represent stock embedding trends, resulting in less discriminative learned stock embeddings.

Conclusions
Recommending stocks by predicting the daily ranking of stock price changes is a challenging but highly valuable task. In this paper, we propose a static-dynamic hypergraph neural network framework based on residual learning to predict the ranking of stocks. The SD-RL framework offers significant advantages in modeling high-order data correlations and uncovering latent relationship information. The effectiveness of our proposed model is validated on real-world datasets from both the Chinese A-share and US stock markets. First, we compare the prediction performance of our model against various baseline models to assess its feasibility. Second, we construct and analyze three different variants of the SD-RL model to understand the influence of each component. Third, the feature embeddings of SD-RL and of the GRU network are visualized in a two-dimensional space. Additionally, we investigate the impact of various hyperparameter values on model performance. The experimental results indicate that our proposed approach is more practical and better suited to real-world applications than existing methods, equipping investors with crucial information for making profitable investment decisions. Furthermore, our model can be extended to the analysis of other graph-structured data, such as traffic flow prediction.
In future research, we plan to explore the integration of multisource information, including online financial news and social media data, into our model.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Figure 3 :
Figure 3: Illustration of hypergraph construction and hypergraph convolution. In the process of hypergraph construction, we follow these steps. Initially, each node searches for its nearest neighbor, which is marked as a star node in the figure. For clarity of illustration, we depict two star nodes in the figure, representing the closest points among the surrounding nodes. These star nodes are enclosed within blue dotted lines, effectively forming hyperedges. Next, we treat the star node as the central node and create a hyperedge by applying the k-nearest-neighbor algorithm. The connection is denoted by a red dotted line (in the figure, it is indicated as 2-nn).

(i) GRU + Attn: only the temporal feature extraction module of SD-RL is retained. It is used to verify the impact of adding a temporal attention layer to the GRU neural network.
(ii) GRU + Attn + Sta: only the temporal feature extraction module and the static graph module of SD-RL are retained. It is used to verify the impact of the static graph module on improving performance.
(iii) GRU + Attn + Sta + Dy: the temporal feature extraction module, static graph module, and dynamic hypergraph module of SD-RL are retained. It is used to verify the impact of integrating the two information flows from the static graph and the dynamic hypergraph.
(iv) GRU + Attn + Sta + Dy + Ind: the complete model (SD-RL). It is used to verify the impact of combining the three information flows from the static graph, the dynamic hypergraph, and the individual stock time series.

Formulation. Let a stock set S = {s_1, s_2, ..., s_N} denote N individual stocks. We collect the historical price records of each stock over the past L days, X ∈ R^{N×L×F}, where F is the number of daily features, one of which is the closing price of stock s_i on trading day t. Given X_t ∈ R^{N×F}, the purpose of our model is to learn a function f(X_t, Θ) that maps X_t to ranking scores and produces a score ranking list.

Table 1 :
Ranking performance of the different methods on the China A-share datasets when considering different values of N. The best result in terms of each metric is indicated in bold.

Table 2 :
Ranking performance of the different methods on the NASDAQ and NYSE datasets when considering different values of N. The best result in terms of each metric is indicated in bold.

Table 3 :
The results of the ablation study. The best result in terms of each metric is indicated in bold.