Assessment of Pilots’ Cognitive Competency Using Situation Awareness Recognition Model Based on Visual Characteristics



Introduction
Situation awareness (SA) enhancement among ship pilots is critical to reducing human errors, which have accounted for 75%-96% of maritime accidents in recent years [1,2]. With the increasing size, speed, and traffic density of ships, enhancing operational safety for pilots has become a pressing issue [3]. However, due to the complexity of marine systems and the growing integration of intelligence and automation, pilots face an increasingly daunting task in comprehending the current situation and predicting future changes [4,5]. To effectively prevent unsafe behaviors among pilots, it is necessary to recognize their SA levels from a cognitive perspective, particularly in emergency situations during ship pilotage [6]. However, the requirement for pilots to maintain high SA levels, along with individual differences and the empirical aspects of pilotage, can create measurement gaps [7]. Therefore, recognizing SA levels as a means of avoiding human errors is a complex process that requires further investigation, especially in ship pilotage emergencies.
Current research primarily focuses on statistical methods to determine the correlation between poor visibility and real-world marine navigation accidents [3]. Since poor visibility deteriorates pilots' perception of nearby contextual changes, it contributes prominently to a heightened risk of accidents [8]. Chauvin et al. [9] identified visibility as a significant contributing factor to collision accidents, accounting for 56.51% of contextual factors according to the Human Factors Analysis and Classification System (HFACS). Supporting this conclusion, Bye and Aalberg [10] assessed the impact of different visibility conditions on accident causation. Their findings indicated that the most pronounced effect occurred when visibility dropped below 0.25 nautical miles. Although current research has successfully examined whether poor visibility is more likely to trigger accidents than other environmental variables (such as flow rate and water depth), it has neglected the cognitive state changes in pilots when anticipating potential hazards under poor visibility conditions [11]. Currently, a widely recognized cognitive framework for SA is the three-tiered architecture, which encompasses three stages: perception, comprehension, and projection of the present environmental state into the future [12]. Hence, in situations characterized by reduced visibility, applying the three-tier cognitive framework to precisely evaluate the SA levels of pilots becomes a critical challenge that requires attention, as pointed out by [13].
Eye-tracking techniques have been used in the literature to examine the relationship between cognitive states and visual behaviors, including pupil diameter, saccade frequency, and fixation time [14,15]. Louw and Merat [16] demonstrated evident scattering of a driver's visual attention during automated driving in simulation experiments. Nonetheless, the internal processes connecting the level of automation to attention were not taken into account. Given this observation, the connection between eye movement measurements and SA level, as well as the fundamental influencing factors, has emerged as a central research area in the transportation domain [17]. In many works, fixation metrics are linked to SA [17][18][19]. Moderate associations between saccade metrics and SA have also been verified in various studies [20,21]. However, no work has identified a connection between pupil dilation or blink rate and SA [22]. It is worth noting that although some studies have utilized multiple eye-tracking metrics [23], not all of them were associated with SA. In general, it is acknowledged that longer time spent by subjects in a specific area of interest (AOI) indicates higher SA [24]. However, a consensus remains elusive concerning the correlation between eye-tracking metrics and SA due to discrepancies in application objectives and task conditions.
Despite the extensive research on the correlation between eye-tracking metrics and SA in various application domains, the problem of SA identification based on eye-tracking technology remains unresolved. From the aforementioned literature, it is evident that some studies did not consider SA measurement as their primary research goal [25,26] or included it only as one of several goals [15,17]. Only a limited number of studies, such as [23,27], explicitly aimed to establish the link between eye-tracking metrics and SA. Furthermore, to the best of our knowledge, direct investigations of eye-tracking-based SA recognition approaches are scarce.
Therefore, this study has two main objectives. The first was to evaluate the correlations between eye movement features and SA. Through correlation analyses, we confirmed significant associations between saccade and fixation metrics and SA levels, consistent with previous findings [28]. The ultimate goal was to explore an SA recognition approach based on the relevant eye-tracking metrics, with the hope of reducing pilotage risks by enhancing the selection and training of pilots. With this objective in mind, this paper introduces a novel approach for assessing pilots' SA using eye-tracking metrics. It is anticipated that this method could contribute to the development of an SA assessment technique based on physiological measurements and offer insights into how to train and enhance pilots' SA.

Experimental Methods
Following the eye-tracking experiment conducted on a bridge simulator, a research framework was developed to detect the correlation between eye movement metrics and SA levels (Figure 1). Initially, it was hypothesized that visual attention is significantly associated with SA, particularly in specific scenarios such as poor visibility. This hypothesis was confirmed in a previously accepted paper, which also specifies the participants, situations, areas of interest (AOIs), and experimental procedures [28]. Building on the eye-tracking technique and the SART questionnaire, the SA level groups were established as independent variables, with eye movement metrics designated as dependent variables. To guarantee the questionnaire's professionalism and measurement accuracy, safety engineering and management procedures were employed for SART questionnaire validation. This process included regulatory oversight from maritime authorities and input from seasoned pilots. Subsequently, heatmaps and scan paths were used to illustrate the visual distribution of pilots across various AOIs as part of an initial cognitive analysis. Permutation simulation was then utilized to identify eye movement metrics that exhibited a significant correlation with SA during ship pilotage. The eye movement metrics showing significant differences were divided into testing and training sets, serving as input for the random forest-support vector machine (RF-SVM) algorithm. This approach allows for the categorization of SA levels to screen pilots and was preliminarily validated in a previously accepted paper [29].
In the aforementioned simulation experiments, the collection of synchronous real-time eye-tracking data was the initial step. However, due to the susceptibility of the eye tracker to illumination and pilots' head movements, the data often contain significant outliers, leading to identification errors [30,31]. To effectively reduce the influence of noise on performance when employing wearable eye-tracking technology, an SA recognition model was created using the RF-SVM algorithm. This model consists of modules for data input, SVM, modified RF, and verification [32]. RF utilizes a voting integration approach based on decision tree classifier predictions. As an ensemble learning technique, RF is known for its accuracy and robustness in handling noise and outliers [33]. SVM, on the other hand, is a machine learning approach based on statistical learning theory that offers distinct advantages in addressing small-sample recognition and high-dimensional nonlinear pattern recognition challenges [34]. Using the RF-SVM approach we introduce, pertinent eye-tracking feature sets can be derived by taking the feature importance ranking in the RF into account as input for the SVM. The validation module assesses the accuracy in identifying the SA level of pilots, as described by [35].

2.1. Data Analysis.
During the data collection procedures, the Tobii Glasses 2, a wireless wearable eye-tracking device, was employed to gather eye movement data from 25 ship pilots. The pilots engaged in ship piloting tasks in a bridge simulator for a minimum of 40 min, including over 25 min of poor visibility conditions, while being monitored by the eye-tracking device. This process unfolded in three stages: first, the calibration phase of the device was conducted before the experiment; subsequently, the testing phase was carried out throughout the entire piloting task; and finally, posttest interviews were conducted to validate the SA measurement results.
To analyze eye movement features at different levels of SA, preliminary extraction and analysis of eye-tracking features were conducted. The eye-tracking device was initially used to gather data, including pupil diameter, fixation count and duration, and saccade count and duration. Figure 2 displays eye-tracking data spanning a 50 s duration; in practice, the device records at a sampling rate of 50 Hz, resulting in the collection of approximately 2,496,000 samples. However, the limited spatiotemporal sampling capabilities of eye-tracking devices pose restrictions on acquiring visual information from the peripheral environment. To address this, interpolation was employed to fill in missing data points, and noise reduction was achieved through filtering. After filtration, outliers were effectively removed from the gaze data fragment, as shown in Figure 2.
Fixation accuracy drops rapidly as the line of sight deviates from the central visual field. Accordingly, eye-tracking sample types were recognized using the velocity-threshold identification (I-VT) approach [36]. The gaze data underwent classification with the I-VT algorithm, which segregated the data into fixation samples (below a velocity threshold) and saccade samples (exceeding it). The coordinates of the fixation samples were determined in relation to the visual perspective of the subjects. Gaze data with fixation times ranging from 50 to 600 ms and a frequency greater than 3 Hz were selected, and the velocity threshold for ocular movements was set at 30°/s. In this way, visual behaviors could be identified from the eye-tracking data based on the registered coordinates, which, as displayed in Figure 2, correspond to smooth fixation coordinates and fluctuating saccade coordinates. In addition, since the gaze data locations were known, the quantity and duration of fixations and saccades could be computed.
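The velocity-threshold step of I-VT can be sketched as follows. This is a minimal illustration with made-up gaze samples, not the study's implementation, and it applies only the 30°/s threshold; the duration (50-600 ms) and frequency screening would follow as a separate pass.

```python
import numpy as np

def ivt_classify(x, y, t, velocity_threshold=30.0):
    """Label gaze samples as fixation or saccade with the I-VT approach.

    x, y : gaze angles in degrees; t : timestamps in seconds.
    Samples whose point-to-point angular velocity is below the
    threshold (deg/s) are fixations; the rest are saccades.
    """
    x, y, t = map(np.asarray, (x, y, t))
    vel = np.hypot(np.diff(x), np.diff(y)) / np.diff(t)  # deg/s between samples
    vel = np.append(vel, vel[-1])                        # pad to original length
    return np.where(vel < velocity_threshold, "fixation", "saccade")

# 50 Hz samples: a stable gaze, a rapid shift, then a new stable gaze
t = np.arange(10) / 50.0
x = np.array([0.0, 0.1, 0.05, 0.1, 0.0, 2.0, 4.0, 6.0, 6.1, 6.05])
y = np.zeros(10)
labels = ivt_classify(x, y, t)
```

With these samples, the middle rapid shift is labeled as a saccade and the stable segments on either side as fixations.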
With respect to pupil processing, the lowest acquisition parameter of the eye-tracking device was assigned a default value of 2 mm. In addition, linear interpolation was applied to address the presence of a few extreme or missing values in the raw data [37]. Figure 3 presents the processed data. On the whole, by integrating the eye-tracking device's output types with the identification and calculation techniques described above, signal features could be classified into fixation, saccade, and noise types. Noise constituted 16.8% (equivalent to 87.6 minutes) of the total experimental duration, falling within the expected range.
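A minimal sketch of the interpolation step, assuming that invalid pupil samples are those that are missing or below the 2 mm acquisition bound (the function name and data are illustrative, not from the study):

```python
import numpy as np

def clean_pupil(diameter, min_valid=2.0):
    """Replace missing or out-of-range pupil samples by linear interpolation.

    min_valid mirrors the device's assumed 2 mm lower acquisition bound.
    """
    d = np.asarray(diameter, dtype=float)
    bad = ~np.isfinite(d) | (d < min_valid)          # NaN or physically implausible
    good_idx = np.flatnonzero(~bad)
    d[bad] = np.interp(np.flatnonzero(bad), good_idx, d[good_idx])
    return d

# one missing sample and one implausibly small sample (mm)
pupil = [3.1, 3.2, float("nan"), 3.4, 1.0, 3.6]
cleaned = clean_pupil(pupil)
```

The bad samples are replaced by values interpolated linearly between their valid neighbors, leaving the valid samples untouched.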

2.2. Modified RF Module.
RFs, a machine learning approach, are efficient tools for classification and assessment tasks [38]. In this investigation, training subsets were created using the bootstrap sampling method from preprocessed samples that had undergone interpolation and filtration [29]. For each subset (S_1, S_2, ..., S_k), a decision tree model was established, and the classification outcomes were assessed using the majority voting principle. The RF algorithm aimed to create a decision tree that depended on a stochastic variable θ, using the data sample X and the recognition variable Y as inputs. Suppose that the recognition result of a single decision tree classifier h(x, θ_i) is h_i(X). The model's final recognition result is then obtained by majority voting:

H(X) = arg max_Y Σ_{i=1}^{k} I(h_i(X) = Y),  (1)

where I(·) is the indicator function. To evaluate the importance of a feature in RF, noise was introduced into that feature, and the resulting decrease in recognition accuracy was measured [39]. The importance of the eigen-parameters was assessed using the residual mean square of the out-of-bag (OOB) score during the computational process:

OOB score = (1/N) Σ_{t=1}^{N} (errOOB2_t − errOOB1_t),  (2)

where errOOB1 and errOOB2 represent the OOB recognition errors of tree t before and after introducing noise interference into the sample feature, and N is the number of trees. Model stability was guaranteed through a grid search methodology aimed at determining the most suitable variables. To avoid overfitting of the feature data, we employed the root mean square error (RMSE):

RMSE = sqrt((1/n) Σ_{i=1}^{n} (y_obs,i − y_pred,i)^2),  (3)

where y_obs and y_pred denote the observed and forecasted values of the corresponding samples, respectively, and n stands for the sample quantity. In addition, to address the potential issue of partial information overlaps that may arise from initial feature correlations, the optimal combination of features was obtained using principal component analysis (PCA):

F_i = a_{i1}X_1 + a_{i2}X_2 + ... + a_{iz}X_z,  i = 1, 2, ..., z,  (4)

where the principal components are mutually uncorrelated, that is, Cov(F_i, F_j) = 0 (i ≠ j; i, j = 1, 2, ..., z). The initial principal component, F_1, is the linear combination that captures the largest variation in the sequence of initial importance values (X_1, X_2, ..., X_z). The second principal component, F_2, is a linear combination of X_1-X_z that is independent of F_1 and captures the next-largest share of the variation. Likewise, the remaining principal components were identified and served as the new input set for the SVM.
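The RF majority-voting rule and the OOB-based feature importance described above can be illustrated with a small sketch; the tree outputs and error values below are invented for demonstration only.

```python
import numpy as np

def rf_vote(tree_predictions):
    """Majority-vote aggregation over decision-tree outputs, per sample."""
    preds = np.asarray(tree_predictions)           # shape: (k trees, m samples)
    vote = lambda col: np.bincount(col).argmax()   # most frequent class label
    return np.apply_along_axis(vote, 0, preds)

def oob_importance(err_after, err_before):
    """Mean rise in OOB error after perturbing a feature across all trees."""
    return float(np.mean(np.asarray(err_after) - np.asarray(err_before)))

# three trees voting on four samples (labels: 0 = low SA, 1 = high SA)
votes = [[1, 0, 1, 1],
         [1, 1, 0, 1],
         [0, 0, 1, 1]]
final = rf_vote(votes)
```

A feature whose perturbation raises the OOB error the most receives the highest importance, which is the basis of the ranking used later for feature screening.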

2.3. SVM Module.
SVMs are commonly utilized to address binary classification problems involving limited and nonlinear samples [40]. In the integration model for SVM, the primary goal was to achieve optimal identification results by employing a voting mechanism over multiple base classifiers with different feature combinations. The data fitting process of SVM approached the challenge of feature optimization within the identification framework through three key steps: feature identification, validation of the kernel function, and parameter optimization of the base classifiers. The adapted RF approach, which integrated the PCA algorithm, enabled the transfer of importance rankings and the refinement of the input set.
Step 1. Feature identification: The input samples F = {(x_i, y_i) | i = 1, 2, ..., |F|} were mapped to a feature space using a nonlinear function, and the optimal classification hyperplane that minimized the distance from all samples to this plane was constructed. In ideal circumstances, the linear classification function can be represented as

f(x_i) = w^T φ(x_i) + b,  (5)

where w^T is the weight, φ(x_i) is the kernel mapping, and b is an offset term. Evaluating the precision of the classification procedure is of utmost importance. To minimize the error, an insensitive loss function with two relaxation factors (ξ and ξ*) was introduced.
Assuming that all training samples are separated with an accuracy of ε, the classification condition is

y_i − w^T φ(x_i) − b ≤ ε + ξ_i,  w^T φ(x_i) + b − y_i ≤ ε + ξ_i*,  ξ_i, ξ_i* ≥ 0.  (6)

To optimize the SVM classification function, equation (6) was reformulated as a minimization problem. Subsequently, the Lagrange multiplier method, along with the dual principle, was utilized to transform the minimization problem into a dual optimization problem:

min (1/2)‖w‖² + C Σ_{i=1}^{n} (ξ_i + ξ_i*),  (7)

max_{α,α*} −(1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} (α_i − α_i*)(α_j − α_j*) K(x_i, x_j) − ε Σ_{i=1}^{n} (α_i + α_i*) + Σ_{i=1}^{n} y_i (α_i − α_i*),
s.t. Σ_{i=1}^{n} (α_i − α_i*) = 0,  0 ≤ α_i, α_i* ≤ C,  (8)

where C represents the penalty coefficient and α_i, α_i*, α_j, and α_j* are Lagrangian multipliers.
Step 2. Validation of the kernel function: In practice, the training samples often do not meet the requirement of linear separability. To address the classification of nonlinear features, a kernel function was introduced to map the training samples from the original space to a high-dimensional Hilbert feature space. This process enabled the determination of a discriminative hyperplane with the greatest margin between categories, so that a boundary that is nonlinear in the original space could be established. In this investigation, the selected kernel function was the Gaussian radial basis function (RBF), which offers good generalization ability and has the kernel bandwidth (σ²) as its main parameter. The RBF can be expressed as follows:

K(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²)).  (9)

Step 3. Parameter optimization of SVM and voting integration: The two important parameters of the Gaussian RBF kernel are the penalty coefficient C and the kernel bandwidth σ². To enhance the learning and generalization capabilities of the SVM method, these parameters were optimized using a grid search approach. Consequently, the SVM model obtained with the optimal parameters yielded the classification function based on the kernel in equation (9). The final identification result was determined through the voting integration of the base classifiers.
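The Gaussian RBF kernel and the grid search over (C, σ²) can be sketched as follows. The score surface here is a toy stand-in for cross-validated accuracy (not the study's data), chosen only so that the search illustrates how an optimum such as C = 6, σ² = 0.005 would be located.

```python
import numpy as np
from itertools import product

def rbf_kernel(x_i, x_j, sigma2=0.005):
    """Gaussian RBF kernel; sigma2 is the kernel bandwidth."""
    diff = np.asarray(x_i) - np.asarray(x_j)
    return np.exp(-np.sum(diff ** 2) / (2 * sigma2))

def grid_search(score_fn, C_grid, sigma2_grid):
    """Exhaustive search over (C, sigma^2); returns (best score, C, sigma^2)."""
    return max((score_fn(C, s2), C, s2) for C, s2 in product(C_grid, sigma2_grid))

# toy score surface peaking at C = 6, sigma2 = 0.005 (illustrative only)
score = lambda C, s2: 1.0 - abs(C - 6) * 0.01 - abs(s2 - 0.005)
best_score, best_C, best_s2 = grid_search(score, [2, 4, 6, 8], [0.001, 0.005, 0.01])
```

In practice the score function would be the cross-validated accuracy of an SVM trained with each (C, σ²) pair.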
2.4. Verification Module. The test dataset, containing samples with indeterminate categories, produced a confusion matrix that included true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) as the model's output. To assess the recognition performance of the RF-SVM approach, a comparison was made with RF and SVM without optimized feature combinations. The assessment criteria for each classifier included the true positive rate (TPR), true negative rate (TNR), and general accuracy (ACC):

TPR = TP/(TP + FN),  (10)
TNR = TN/(TN + FP),  (11)
ACC = (TP + TN)/(TP + TN + FP + FN).  (12)

In these formulas, TP represents samples where both the observed and predicted values are 1, and TN represents samples where both are 0. FP refers to samples with an observed value of 0 but a predicted value of 1, and FN to samples with an observed value of 1 but a predicted value of 0.
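The three assessment criteria can be computed directly from binary labels; a minimal sketch with invented observed and predicted values:

```python
def confusion_metrics(y_true, y_pred):
    """TPR, TNR, and ACC from binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tpr = tp / (tp + fn)                 # sensitivity
    tnr = tn / (tn + fp)                 # specificity
    acc = (tp + tn) / len(pairs)         # overall accuracy
    return tpr, tnr, acc

tpr, tnr, acc = confusion_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```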

Results
To construct the SA recognition model using eye-tracking metrics, groups with different levels of SA were first created. For the purpose of analysis, pilots' SA was divided into two levels on the basis of their SART scores: high (above the mean SART score) and low (below the mean). Based on the SART scores (mean = 20.13; standard deviation = 5.83), the pilots were assigned to either the high-SA group (n = 13; mean = 24.5; standard deviation = 5.13) or the low-SA group (n = 12; mean = 15.2; standard deviation = 4.37).
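The mean-split grouping can be sketched as follows. The scores are illustrative, not the study's data, and assigning scores exactly equal to the mean to the low-SA group is an assumption, since only strict inequalities are specified.

```python
def split_by_sart(scores):
    """Group pilots by the mean SART score: high SA if score > mean, else low.

    Returns index lists for the high and low groups plus the mean itself.
    """
    mean = sum(scores) / len(scores)
    high = [i for i, s in enumerate(scores) if s > mean]
    low = [i for i, s in enumerate(scores) if s <= mean]   # ties -> low (assumed)
    return high, low, mean

# illustrative SART scores for four pilots (not the study's data)
high, low, mean = split_by_sart([24.0, 15.0, 26.0, 17.0])
```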

3.1. AOI Analysis.
To compare the allocation of attention and scanning strategies among subjects in different SA groups, we conducted an initial assessment of pilot visual behavior under conditions of poor visibility, based on thermograms and scan paths of eye-tracking metrics. The thermogram evaluation revealed two key findings: during marine pilotage, participants primarily focused on the electronic chart (AOI-1) and outside the window (AOI-2). Notably, significant differences were observed in the time spent fixating on the same AOIs across different SA groups, as illustrated in Figures 4(a) and 4(b). Specifically, participants in the low-SA group allocated more attention to AOI-1, while those in the high-SA group directed their focus more toward AOI-2.
These observations indicate a potential link between the selective distribution of attention across AOIs and the SA level of pilots.
Furthermore, the scan path analysis showed that AOI-2 was the main area where pilots scanned back and forth. Interestingly, the high-SA group exhibited a significantly higher scanning frequency than the low-SA group (Figures 4(c) and 4(d)). This can be attributed to the fact that in real-life situations, the high-SA population requires frequent confirmation of perceptual elements and timely updates of their mental model to accurately forecast and anticipate behavior. Taken together, these findings provide evidence for a probable relationship between scanning strategies and the level of SA in poor visibility situations.

3.2. Correlation Evaluation.
To clarify the influence of different AOIs on the significant disparities between eye-tracking metrics and pilots' SA levels, a statistical evaluation was undertaken focusing on AOI-1 and AOI-2. We selected these two AOIs because they represented the main targets of fixation and saccadic movement during the poor visibility scenario, as indicated by the descriptive statistics of the eye-tracking metrics. The association between eye-tracking metrics and SA levels in AOI-1 and AOI-2 was analyzed individually using permutation simulations, as outlined in Table 1.
Considering the adverse influence of reduced visibility on pilots' ability to gather real-time environmental information through scene scanning, it is worth noting that the low-SA group exhibited the longest average fixation duration in AOI-1. This suggests that the low-SA group might have concentrated their attention on acquiring primary perceptual information in AOI-1. In contrast, the average fixation time of the high-SA group was longer in AOI-2 than in AOI-1, indicating that this group still prioritized the necessary feedforward information processing in AOI-2. Given the correlation evaluation between AOI-1 and AOI-2 under the poor visibility scenario, the foregoing findings open the way for detecting pilots' SA levels with the relevant fixation and saccade metrics in such a setting. In addition, data segmentation was performed using a sliding time window of 5 s (i.e., the epoch length) to reduce data noise and volume. The computational results obtained from this segmentation were regarded as the candidate features for the subsequent recognition model, as listed in Table 2.
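The 5 s epoch segmentation might look like the following sketch, which aggregates a per-sample metric into consecutive windows. Non-overlapping windows are assumed here; the study's exact window step is not specified.

```python
import numpy as np

def epoch_features(timestamps, values, epoch_s=5.0, agg=np.mean):
    """Aggregate a per-sample metric into consecutive epoch_s-second windows."""
    t = np.asarray(timestamps, dtype=float)
    v = np.asarray(values, dtype=float)
    edges = np.arange(t.min(), t.max() + epoch_s, epoch_s)
    idx = np.digitize(t, edges) - 1            # window index of each sample
    return np.array([agg(v[idx == k]) for k in range(len(edges) - 1)])

# ten 1 Hz samples -> two 5 s epochs, each summarized by its mean
t = np.arange(10.0)
v = np.arange(10.0)
epochs = epoch_features(t, v)
```

Any per-epoch statistic (count, median, duration sum) can be substituted via the `agg` argument to produce the fixation, saccade, and noise features of Table 2.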

3.3. SA Identification.
In the present research, the state of cognition was categorized into high and low SA levels by introducing a nonlinear RF-SVM algorithm that exploited the eye-tracking data derived from the marine pilotage experiment. Following data preprocessing via filtration and interpolation, the features were classified into 9 groups, as detailed in Table 2. Next, the features were assigned to the testing and training sets, with the latter accounting for 75% of all samples. The RF-SVM approach was formulated in accordance with this training set. The best parameters for the RF model were identified through the grid search method, as depicted in Figure 5. A maximum search efficiency of 0.9252 was achieved by configuring the model with 151 estimators and a maximum depth of 20. Subsequently, verification of the optimal RF variables was carried out, and the importance ranking of the initial features was established according to the average score (Table 3).
The feature with the minimum importance score in Table 3 was discarded so that overfitting of the recognition model could be avoided. Subsequently, we fed the remaining features back into the RF model. This feature screening process was repeated for 8 iterations, and the RMSE and relative error (RE) of 1-8 features were estimated, as detailed in Table 4. According to the obtained findings, the RMSE and RE were minimized when using six features. Consequently, the features ND, SC, and MN were excluded, leaving six valid features. Furthermore, to address potential information overlap in the visual features from the same participants, dimension reduction was performed using the PCA algorithm. Table 5 lists the eigenvalues of the correlation coefficient matrix, as well as the contributions of the various principal components. The first four principal components were chosen as the extraction criterion, as they collectively accounted for over 90% of the variance. These four principal components (F_i) were extracted through feature combinations and used to generate new input sets for the SVM method. The optimization of the SVM parameters was performed using the grid search technique, and the results are illustrated in Figure 6. A maximum efficiency score of 0.9328 was attained with C = 6 when the bandwidth g of the kernel function was 0.005. After calculating the base classifiers for the SVM, the optimal identification result was obtained through voting integration.
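A PCA reduction that keeps the leading components explaining at least 90% of the variance, as used for the four retained components, can be sketched as follows (synthetic data, not the study's feature matrix):

```python
import numpy as np

def pca_reduce(X, variance_target=0.90):
    """Keep the leading principal components that jointly explain at least
    variance_target of the total variance, and project X onto them."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)          # standardize features
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
    order = np.argsort(eigvals)[::-1]                  # descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()         # cumulative variance share
    k = int(np.searchsorted(ratio, variance_target)) + 1
    return Xs @ eigvecs[:, :k], k

# synthetic matrix with two nearly duplicated features and one independent one
rng = np.random.default_rng(0)
a, b = rng.normal(size=100), rng.normal(size=100)
X = np.column_stack([a, a + 0.01 * rng.normal(size=100), b])
F, k = pca_reduce(X)
```

Because two of the three features are nearly identical, two components suffice to cross the 90% threshold, and the projected matrix `F` becomes the new input set for the SVM.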
By comparatively analyzing RF-SVM, RF, and SVM, the optimized feature combinations were assessed for validity with 3 performance evaluation metrics. Figure 7 illustrates the distributions of the classification precision measures for the 3 classifiers across the performance metrics. The central mark in each box indicates the median, with the box edges representing the 25th and 75th percentiles. When utilizing the best eye-tracking feature combination in the RF-SVM algorithm, the average accuracy (ACC) across 100 iterations reached 0.934, with a TNR of 0.875 and a TPR of 0.940. Overall, the RF-SVM outperformed the SVM and RF models without feature optimization.
Since receiver operating characteristic (ROC) curves can be plotted to visualize the accuracy of a classification method, the ROC combined with the area under the curve (AUC) is usually used to evaluate binary classification [41]. The RF-SVM algorithm was retrained and retested using the optimal parameters, and the ROC graphs with the AUC were analyzed for comparative purposes. Figure 8 illustrates the results, with an AUC score of 0.894 for RF, 0.907 for SVM, and 0.942 for RF-SVM, indicating the higher stability of RF-SVM. The TPR was used to confirm the superior sensitivity of RF-SVM, while its specificity was supported by the TNR. Performance information for the three classification algorithms based on the assessment methodology is presented in Table 6. The RF-SVM algorithm, using the optimized features as input data, achieved an average accuracy of 0.934, an average sensitivity of 0.940, an average specificity of 0.876, and an AUC score of 0.942. These findings suggest that RF-SVM, with optimized parameters, outperforms the classical models in recognizing eye movement features associated with different levels of SA. This provides valuable insights for developing a screening and assessment model for pilot competency.
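The AUC itself can be computed without plotting, via the rank-based identity: it equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one. A small sketch with invented scores:

```python
def auc_score(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly
    (ties between a positive and a negative score count as half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# invented labels and classifier scores: 3 of 4 pairs ranked correctly
auc = auc_score([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2])
```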

Discussion
Despite the numerous advantages of physiological measures, such as their continuous and objective nature, in comparison to direct subjective assessments, few studies have investigated the assessment of SA in pilotage tasks using physiological measurement techniques. In this study, we explored an SA recognition method for pilots based on eye-tracking data gathered during a bridge simulation experiment, in which eye-tracking data were collected from 25 pilots during low-visibility scenarios to evaluate their competency. The significance of the correlations between visual behaviors (represented by thermograms and scan paths) and SA level (determined by SART) was quantified through a permutation test. Relevant eye-tracking features were derived from the pertinent metrics and divided into 5-second segments following interpolation and filtration to prevent overfitting. The top 6 significant features were selected based on their RE and RMSE, and the feature combinations in the RF model were determined through PCA. The next step was precise RF-SVM-based identification of the at-risk cognitive state, that is, a low SA level. As revealed by the comparison across the 3 performance metrics, the performance of our model was superior to that of the remaining 2 models, which lacked feature optimization.
The findings of the current work demonstrate that effective identification of the at-risk cognitive state is possible by utilizing eye-tracking features within a well-designed recognition framework that offers high accuracy, sensitivity, and specificity. Figure 7 presents the recognition results obtained from RF-SVM, RF, and SVM utilizing different types of features. It is evident that the classification accuracy significantly improved to 93.43% (RF-SVM) when we used a combination of the top 6 most significant features, in contrast to 86.79% for RF without feature optimization. When the same feature data were employed as inputs, a comparison between RF and SVM revealed that SVM outperformed RF in terms of ACC, TPR, and TNR. This highlights the suitability of SVM when dealing with a small sample of nonlinear eye-tracking data in a low-visibility environment. Hence, an effective RF-SVM model for detecting the at-risk cognitive state needs to comprise a strong classifier (i.e., the SVM module), as well as input data optimized with a combination of salient eye-tracking features (i.e., the modified RF module).
It should be noted that a precise comparison of results with other eye-tracking studies is challenging due to variations in the simulation approach, eye-tracking feature selection, and classification of SA groups across studies. In addition, a limitation of this study is the performance of the eye-tracking devices. When pilots exhibit rapid movements, data collection may be hindered by the slower acquisition speed of the eye-tracking device, leading to a reduction in local data. While data interpolation can assist in addressing this concern, it may inevitably affect the significance of the correlation results. Moreover, in terms of feature extraction, the accuracy of recognition is influenced by the duration of the data segments (epochs) and the correlation metrics of the eye-tracking data in conditions of poor visibility. The present study also provides evidence that ship pilots with high levels of SA tend to exhibit more active visual behaviors, as indicated by nine eye-tracking features with epoch lengths of 5 s. However, as previously demonstrated in other studies [42][43][44], SA levels are associated not only with eye movement features but also with other physiological measurements. Therefore, recognizing at-risk cognitive states is a complex process that warrants further investigation, considering fusion metrics that include electroencephalogram (EEG) and heart rate variability (HRV) signals with varying epoch lengths, as well as the classification of SA groups using diverse approaches [29].
With the development of sensing technology, the optimization of portable devices makes data acquisition and filtering possible for pilots during actual tasks [45]. The current results constitute a preliminary exploratory study of using physiological indicators to monitor the pilotage process, and subsequent research on behavioral pattern recognition fusing EEG can provide technical support for the auxiliary decision-making of intelligent navigation.

Conclusion
Situation awareness (SA) plays a crucial role in marine safety, as a lack of SA can contribute to the approximately 75% of maritime accidents caused by human error. In contrast to direct SA assessment methods, physiological measurement techniques, such as eye-tracking, have the potential to provide objective and continuous SA evaluation in pilotage tasks. Nevertheless, it remains unclear how to infer SA from eye movement features. This study conducted a bridge simulation experiment for ship piloting to examine the relationship between eye-tracking features and SA, as assessed by the SART. In addition, we developed an RF-SVM model to identify pilots' SA based on eye-tracking metrics.
The results obtained with our RF-SVM algorithm provide evidence of its effectiveness in recognizing at-risk cognitive states, specifically low levels of SA, based on eye-tracking features. Our identification model, which includes modified RF, SVM, and verification modules, was applied to eye-tracking data collected from 25 ship pilots during a bridge simulation experiment conducted under poor visibility conditions. By employing permutation simulations and PCA guided by the RMSE and RE, we determined that the optimal eye-tracking features consist of four principal components. The performance of our proposed feature combinations surpasses that of RF and SVM without feature optimization, revealing the potential of our approach for computer-assisted screening of the cognitive competency of ship pilots. The outcomes of this study could establish a theoretical foundation for the utilization of physiological measurement techniques in assessing SA and, in turn, aid in the monitoring and evaluation of pilots' competencies. Moving forward, it will be important to validate and apply our recognition model in various emergency scenarios, such as ship departure, anchoring, and encounters, using multiple fusion metrics such as HRV and EEG. By doing so, we can not only benefit from immediate improvements in cognitive state surveillance and the prevention of unsafe behavior but also pave the way for comprehensive systems to assess the physical and mental competencies of ship pilots.

Figure 1: Framework for SA level recognition in maritime pilotage simulations.

Figure 2: Processed measurement fragment for the fixation position.

Figure 3: Processed measurement fragment for the pupil diameter.

Figure 4: Visual behaviors in poor visibility. (a) Heatmap of the high-SA group. (b) Heatmap of the low-SA group. (c) Scan path of the high-SA group. (d) Scan path of the low-SA group.
Fixation: median fixation duration in 5 s (MF), fixation duration in 5 s (FD), and fixation count in 5 s (FC). Saccade: median saccade duration in 5 s (MS), saccade duration in 5 s (SD), and saccade count in 5 s (SC). Noise: median noise duration in 5 s (MN), noise duration in 5 s (ND), and noise frequency in 5 s (NF).

Figure 5: Parameter optimization of the RF method.

Figure 8: ROC curve of the three methods.

Table 1: Correlation results in poor visibility. a: p < 0.1; b: p < 0.05. The bold values in Table 1 indicate the p values of eye-tracking metrics significantly associated with SA levels.

Table 2: List of calculated features.

Table 3: Initial feature importance scores and ranking.

Table 4: Error values with varying feature quantities.

Table 5: Principal components of the features.

Table 6: Comparative analysis of performance metrics.