Trustworthiness Assessment for Crowdsourcing-Based Citywide Parking Availability Sensing via Connected and Automated Vehicles

Real-time status acquisition of parking spaces is highly valuable for an intelligent urban parking system. Crowdsourcing-based parking availability sensing via connected and automated vehicles (CAVs) provides a feasible method with the advantages of high coverage and low cost. However, data trust issues arise from incorrect detection and incomplete information. This paper proposes a trustworthiness assessment method for crowdsourced CAV data considering different impact factors, such as the distance between the CAV and the target parking space, line abrasion, scene complexity, and image sharpness. The crowdsourced CAV data are collected through extensive field experiments and PreScan simulations. The classical line detection algorithm of VPS-Net and the target detection algorithm of YOLO-v3 are applied to detect on-street parking availability. A failure probability model based on the XGBoost algorithm is then developed to establish the relationship between data trustworthiness and different impact factors. The results show that the proposed model has an average accuracy of 78.29% and can effectively assess the degrees of external influences on the trustworthiness of the crowdsourced data. This paper provides a new tool to identify the data quality and improve the sensing accuracy for a crowdsourcing-based parking availability information system.


Introduction
Due to the rapid increase in car ownership and traffic demand, most cities are confronted with intractable problems such as urban traffic jams, parking resource shortages [1], and environmental pollution [2]. Real-time status monitoring of on-street parking spaces is an essential foundation and premise for solving urban parking problems caused by unbalanced supply and demand. Over the past decade, several effective measures have been developed to alleviate searching-for-parking traffic by improving parking resource use efficiency, including advanced parking management strategies [3], parking reservation and dynamic allocation [4], park-and-ride [5], automated valet parking [6,7], and cloud-based centralized parking dispatching [8,9]. Hence, efficient and reliable methods for sensing on-street parking availability are critical for the quality enhancement of parking services and the digital transformation of urban traffic management.
Accurately sensing on-street parking status has remained difficult during the last few decades. Most research uses data from specially deployed in-ground sensors in parking lots or garages [10,11]. Based on the recent advances in sensing and communication technologies, an increasing number of researchers use wireless sensors such as light sensors, distance sensors based on infrared or ultrasound, magnetometers, and even combinations of different sensor devices [12]. However, a huge amount of asphalt digging is required for in-ground sensors, especially loops, which require intrusive installation [13]. Due to its scarcity caused by frequent use, the installation and maintenance of asphalt have become relatively expensive. Furthermore, in-ground sensors have limitations that must be considered during the design phase, including the requirement for charging by cables or batteries [4]. Rather than specially deployed sensors, some studies use data from the already deployed infrastructure, such as on-street parking payment management systems and parking meters [14]. However, missing data have consistently been a noticeable problem in existing methods due to underpaid/overpaid and unpaid transactions. Additionally, this approach depends on the scope of the deployment, and a domain-wide deployment still requires significant overheads.
Fortunately, connected and automated vehicles (CAVs) have progressed considerably in recent years [15][16][17][18]. Sharing is a major development direction in the future [19,20]. Crowdsourced data collected by CAVs are widely used in the scientific literature, and practical applications related to intelligent transportation demonstrate the enormous potential for sensing parking availability status [21]. Unlike roadside devices [22,23], onboard units (OBUs) such as onboard millimeter-wave radar and vehicle-around-view monitors could be more convenient and economical with high update frequency and broad coverage [24]. Study [25] is a typical solution for citywide parking availability sensing based on a fleet of taxis. Parking detection sensors are installed on cabs; these can collect information on the availability of parking spaces and show that crowdsourced data have high suitability as a source for sensing parking status. Meanwhile, some studies demonstrate that the crowdsensing solution would require a significantly smaller number of sensing units than a fixed sensing system [26]. Therefore, due to considerations of cost-effectiveness and reliability, onboard video sensors become the preferred choice.
However, crowdsourced data introduce new issues and challenges. The most significant of these problems is the inferior data quality caused by many external factors such as road scenarios, environmental conditions, facility status, and sensor capability [27][28][29][30]. Data quality can directly affect the accuracy of on-street parking status monitoring. Meanwhile, CAV trajectories are highly random in space and time, exacerbating the impact of data quality on the reliability of estimates due to the uncertain coverage and update frequency. Accordingly, evaluating the trustworthiness of crowdsourced data from CAVs is highly valuable to avoid incorrect judgments arising from excessive trust. Therefore, this paper focuses on the image data obtained from onboard video sensors. It aims to reveal the mechanism of different external factors on parking detection accuracy and propose an assessment method for real-time status monitoring of wide-area parking availability based on crowdsourced data. The main contributions of this paper are summarized as follows:
(i) A trustworthiness assessment framework is proposed for crowdsourced CAV data through systematic simulation experiments, and four environment-related factors affecting the parking detection algorithm are introduced, laying the foundation for improving the accuracy of parking availability detection in an urban context.
(ii) A failure probability prediction model of parking availability sensing is developed based on the XGBoost algorithm, which can quantitatively reveal the influence of different factors on the data trustworthiness of CAVs.
The organization of this paper is as follows. Section 2 introduces the related work, including methods of parking availability detection and the data quality of current image-based object detection methods. Section 3 describes the proposed trustworthiness assessment framework. Section 4 analyzes the external factors for the detection model. Section 5 presents the study results through an XGBoost-based trustworthiness assessment model for single vehicle detection. Finally, Section 6 summarizes the findings and future work.

Related Work
2.1. Parking Availability Detection. Several technical routes exist for citywide parking space detection. A typical method is based on electromagnetic induction. For example, a loop detector determines the occupation of parking spaces using an electromagnetic field with a quantifiable inductance; the field is interrupted and the inductance is reduced when vehicles pass the loop [31]. Magnetic sensors detect parking spaces through the magnetic variations caused by the presence of vehicles [32]. Alternatively, by comparing the counts between two magnetic sensors installed along the pathway, the number of vehicles between them can be obtained [33]. Similarly, piezoelectric sensor detection depends on the induced electric energy resulting from the substance vibration or mechanical stress [34]. However, these methods are vulnerable to their surroundings. Magnetic sensors can be influenced by large metal objects nearby. Ultrasonic and infrared sensors are sensitive to temperature and air pressure. Pneumatic tubes suffer from stress. Even for an inductive loop, the sonar and microwave detectors are sensitive to the vehicle's speed because they fail to detect slow or stationary vehicles [35].
Another popular technical route is leveraging the image-based solution [36,37]. Video sensors can detect multiple spaces simultaneously and provide wider area monitoring than other sensors [27]. Additionally, video sensors offer relatively low cost due to their easy installation, operation, and maintenance [38]. Maeda and Ishii [39] compared collected images with reference images using a normalized principal component of feature characteristics; however, obtaining and updating the reference images are difficult. Some studies use the typical shape of car elements for detection, but this requires many pixels per vehicle. Yamada and Mizuno [40] proposed an approach to detecting vehicle presence with grayscale images. They fragmented each image region corresponding to a cell through density and analyzed the segment area distribution. Baroffio et al. [41] provided a method depending on the hue histogram and linear support vector machine (SVM) with high accuracy. In recent years, with advances in deep learning, some researchers have introduced this to parking space detection. Fan et al. [42] applied deep learning to parking space detection tasks and proposed various neural network-based models, including the multistep long short-term memory recurrent neural network (LSTM-NN) model [43]. Feng et al. [44] introduced a hybrid deep learning framework called dConvLSTM-DCN, designed for accurate prediction of short-term and long-term vacant parking space availability within a region, and developed an intelligent parking guidance system using a deep gated graph recurrent neural network (G2RNN) [45]. Regarding image-based methods, Zhang et al. [46] proposed a method based on DCNN with YOLO-v2 to detect marked points in images. As image data are more complex than other data, Zinelli et al. [47] used an RCNN-based framework to adapt to various conditions. However, RCNN strongly depends on object proposals. Additionally, Suhr et al. [48] used CNN to detect parking spaces in combination with global information and the attributes of the parking spaces. Nurullayev and Lee [49] proposed a method based on a dilated convolutional neural network specifically designed for detecting parking spaces. These methods still suffer from low recognition rates, sensitivity to environmental changes, and weak generalization. To address these problems, Xu and Hu [50] proposed the YOLO-v3-based VPS-Net, the detection method adopted in this paper.

Data Quality of Image-Based Object Detection.
Data quality is essential to real-time monitoring of parking availability status based on crowdsourcing data. Bock et al. [25] reported that an inaccurate detection result strongly influences the sensing of parking availability status and applied Kalman filters to overcome this issue. Extensive research has been conducted on the factors affecting the accuracy of identification results. Dorafshan et al. [51] suggested that edge clarity can impact crack identification and degrade accuracy for challenging settings, such as low lighting conditions, the presence of shadows, and low-quality cameras. Huang et al. [52] assumed that the interference of various types of objects in the picture and the intensity of light necessarily affect the performance of object detection. Zhu [53] indicated that identifying road traffic conditions would be influenced by various factors, including weather and road condition factors. Tabernik and Skočaj [54] proposed that occlusion, brightness, color alteration, distortion, and skew occurring in the background can pose a risk to object detection. Dewi et al. [55] demonstrated that the target size impacts the accuracy of image recognition.
For parking space detection, some studies also offer relevant influencing factors. For example, Amato et al. [56] showed that obstacles such as lampposts and other cars are closely related to detection accuracy. Ling et al. [57] demonstrated that image data from car parking spaces are sensitive to lighting and weather conditions. Yamada and Mizuno [40] demonstrated that the surface of the parking space would influence the detection results, especially for white mark-off lines in poor condition. Tang et al. [58] showed that deep learning models for parking space recognition are subject to variable environments, such as changing illumination, occlusion, and weather. Ichihashi et al. [59] proposed that weather, such as raindrops, can distort the camera image and reduce its sharpness, thus affecting the performance of camera-based vehicle detectors for parking lots. Zaidi et al. [60] indicated that many factors, such as occlusion, lighting, pose, and perspective, can pose a challenge to detection by neural networks. However, previous studies have considered a single factor, limited to either the image quality or the image content. In contrast, this paper combines these two factors, considering the effects of the image and the surrounding environment in which it was taken on monitoring car parking images under different lighting conditions.

Trustworthiness Assessment Framework
3.1. Image-Based Parking Availability Detection. The parking space detection algorithm is the key to parking space status sensing. Its function is to accurately identify on-street parking spaces and determine the occupancy status of parking spaces. The image data collected by the surround-view camera cannot completely encompass the four endpoints comprising a parking space. Only the two closer endpoints comprising the entrance line of the parking space can be obtained, from which the type of parking space cannot be inferred directly. Therefore, as shown in Figure 1, the classical VPS-Net algorithm [61] is applied to identify the outer entrance line endpoints of parking spaces through image grayscale processing and estimate the other two endpoint locations and the type of parking spaces. Meanwhile, a YOLO-v3 pretrained detector is used to detect and classify all marker points and parking space endpoints. Target detection is then performed based on matching pairs of marker points with geometric information. The reliability of the results of parking occupancy classification is enhanced by a deep convolutional neural network (DCNN).

Identification of Parking Space Endpoints and Entry Lines.
Two identified endpoints $p_1$ and $p_2$ that meet the confidence requirements can form an effective entrance line of a parking space only if their distance satisfies a length constraint: $t_1 \le \|p_1 p_2\| \le t_2$ for the entrance line of a parallel parking space, or $t_3 \le \|p_1 p_2\| \le t_4$ for a vertical or an inclined parking space. The parameters $t_1$, $t_2$, $t_3$, and $t_4$ are based on a priori knowledge of the entrance line lengths of the different parking spaces.
Even when the endpoints satisfy the distance constraints, they may still not form a valid parking space entry line. This can be solved by classifying the local image pattern formed by the two endpoints into predefined classes. A local coordinate system is established with the origin at $p(x, y)$, the midpoint of $p_1(x, y)$ and $p_2(x, y)$, and $\overrightarrow{p_1 p_2}$ as the X-axis. The rectangular region $R$ is defined in this coordinate system. Its length along the X-axis is $w_1 = \|p_1 p_2\| + \Delta w$, and its length along the Y-axis is $h_1$, where $\Delta w$ and $\Delta h$ are hyperparameters controlling the width and height of the rectangular region.
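The length constraint and the local coordinate system described above can be sketched as follows. This is a minimal illustration, not the VPS-Net implementation: the function names and the threshold values used in the usage note are hypothetical, and the sketch assumes the Y-axis is the X-axis rotated by 90 degrees counterclockwise.

```python
import math


def candidate_space_types(p1, p2, t1, t2, t3, t4):
    """Return the parking-space types whose entrance-line length range
    contains the distance between endpoints p1 and p2."""
    length = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    types = []
    if t1 <= length <= t2:
        types.append("parallel")
    if t3 <= length <= t4:
        types.append("vertical_or_inclined")
    return types


def to_local_frame(point, p1, p2):
    """Express `point` in the local frame whose origin is the midpoint of
    p1 and p2 and whose X-axis points from p1 to p2 (assumed convention:
    the Y-axis is the X-axis rotated 90 degrees counterclockwise)."""
    mid_x, mid_y = (p1[0] + p2[0]) / 2.0, (p1[1] + p2[1]) / 2.0
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    norm = math.hypot(dx, dy)
    ux, uy = dx / norm, dy / norm          # unit vector of the X-axis
    vx, vy = point[0] - mid_x, point[1] - mid_y
    local_x = vx * ux + vy * uy            # projection on the X-axis
    local_y = -vx * uy + vy * ux           # projection on the Y-axis
    return local_x, local_y
```

For example, with hypothetical thresholds `t1=4.5, t2=6.5, t3=2.0, t4=3.5` (in metres), an endpoint pair 5 m apart would be kept only as a parallel-space candidate.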

Complete Parking Space Deduction.
The complete parking space is obtained by deduction based on geometry and prior knowledge, as the video (picture) collected by the surround-view camera tends not to show the parking space completely. Each parking space comprises four points $p_1$, $p_2$, $p_3$, and $p_4$. Here, $p_1$ and $p_2$ are the two endpoints comprising the entrance line of the parking space, and $p_3$ and $p_4$ are the other two endpoints not covered in the image, whose coordinates can be calculated from the angle $\alpha_i$ and depth $d_i$ of the parking space. Here, $\alpha_1$ is the angle of the vertical and parallel parking spaces; $d_1$ and $d_2$ are the depths of the vertical and parallel parking spaces, respectively; $d_3$ is the depth of the inclined parking space; and $\alpha_2$ and $\alpha_3$ are the angles of inclined parking spaces at an acute or obtuse angle, respectively.
The parking space detection algorithm distinguishes the parking space occupancy based on different color markings. A green rectangle is marked when the parking space is identified as free, and a red rectangle is marked when the parking space is identified as occupied. Detection results fall into three categories:
(i) Correct detection: a car is in the identified parking space and the space is detected as occupied, or no car is in the identified parking space and the space is detected as empty.
(ii) False detection: a parking space is empty but recognized as occupied by the parking space detection algorithm, or a parking space is occupied by a car but detected as empty.
(iii) Missed detection: the parking space is not detected in the multiple frames that contain it.
Since the frame rate of the onboard sensor is set to 10 Hz, a parking space appears in multiple consecutive detection pictures, meaning a parking space has multiple detection results. When the detection results are consistent, they represent the current occupancy status of the parking space. When the results from multiple frames are inconsistent, the parking space is marked as occupied regardless of its actual status, to avoid cruising caused by system indication errors.
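The multi-frame rule above can be sketched in a few lines. This is an illustrative sketch only: the function name, the string labels, and the use of `None` for a space that was never detected are assumptions made here, not part of the original system.

```python
def fuse_frame_results(frame_statuses):
    """Fuse per-frame results ('free' or 'occupied') for one parking space.

    Consistent results are returned as-is; inconsistent results are
    conservatively reported as 'occupied'. An empty list (the space was
    never detected) returns None, standing in for a missed detection.
    """
    if not frame_statuses:
        return None
    if len(set(frame_statuses)) == 1:
        return frame_statuses[0]
    return "occupied"
```

Reporting inconsistent spaces as occupied trades some lost availability for fewer drivers being directed to a space that turns out to be taken.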

Definition and Classification of External Factors.
The parking detection method requires identifying parking space endpoints and entry lines. The quality of the captured images, including the clarity of the parking space line endpoints and entrance lines, impacts the identification [62,63]. Based on the principle of identification, four main factors can be identified that affect the imaging of parking space endpoints and entry lines: (1) roadside distance, (2) line abrasion, (3) scene complexity, and (4) image sharpness, as shown in Figure 5.

Roadside Distance (D).
Roadside distance (D) refers to the vertical distance between the vehicle with the surround-view camera and the entry line of the on-street parking space. This affects the size of the image captured by the surround-view camera and the degree of image edge distortion.

Line Abrasion (A).
Line abrasion (A) refers to the degree of missing or faded white lines of parking space entry due to vehicle movement, weather, and other reasons. The parking space identification detection model outlines the complete parking space by identifying the two endpoints of the entrance line and estimating the locations of the other two endpoints. Therefore, the abrasion of the parking space entry line significantly impacts whether the on-street parking space can be accurately identified.

Scene Complexity (C).
Scene complexity (C) refers to the composition of traffic elements constituting the traffic flow close to the parking space. This affects the identification accuracy when pedestrians, electric vehicles, and other traffic elements encroach on the two endpoints of the parking space entry line. For the complexity of individual parking spaces, a multivariate linear model is defined as follows:
$$C_i = \sum_{r} \omega_r k_r,$$
where $C_i$ denotes the traffic complexity of the $i$-th parking space; $r$ denotes the obstacle category; $\omega_r$ denotes the weight coefficient corresponding to that obstacle category, proportional to the single footprint; and $k_r$ denotes the number of obstacles in that category. The obstacles are divided into three categories in the PreScan simulation experiment to better simulate realistic scenarios: (a) pedestrians (0.4 × 0.7 m); (b) electric vehicles (2.3 × 0.8 m); and (c) boxes, barricades, and so on (1.0 × 1.0 m), and assigned weights according to the footprint, as shown in Table 1.
Therefore, the traffic complexity of a single on-street parking space is calculated as follows:
$$C_i = \omega_a x + \omega_b y + \omega_c z,$$
where $\omega_a$, $\omega_b$, and $\omega_c$ are the weights of categories (a), (b), and (c) given in Table 1; $x$ is the number of pedestrians close to the parking space; $y$ is the number of electric vehicles; and $z$ is the number of boxes, barricades, and so on. The overall traffic complexity $C$ is then defined over the $n$ parking spaces in the model, where $n$ takes a value of 200 in this paper. The number of traffic elements in the parking space is extracted from the image by semantic object color mapping in the image segmentation sensor in PreScan, as shown in Figure 6. After semantic segmentation, different classes of objects are represented by different colors, and the number of different obstacles in each parking space can be directly output for the complexity calculation.
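The per-space complexity can be illustrated with a short sketch. The weights below are assumptions: the text only states that weights are proportional to the single footprint and defers the actual values to Table 1, so the raw footprint areas (in square metres) are used here as stand-ins. The names `FOOTPRINT_WEIGHTS` and `space_complexity` are hypothetical.

```python
# Assumed weights: raw footprint areas in m^2, standing in for Table 1.
FOOTPRINT_WEIGHTS = {
    "pedestrian": 0.4 * 0.7,
    "electric_vehicle": 2.3 * 0.8,
    "box_or_barricade": 1.0 * 1.0,
}


def space_complexity(counts, weights=FOOTPRINT_WEIGHTS):
    """Weighted linear complexity of one parking space.

    `counts` maps an obstacle category to the number of such obstacles
    near the space; the result is the sum of weight * count.
    """
    return sum(weights[category] * k for category, k in counts.items())
```

With these stand-in weights, a space with two pedestrians and one electric vehicle nearby scores 2 × 0.28 + 1.84 = 2.4.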

Image Sharpness (S).
Image sharpness (S) refers to the quality of the captured images. Images captured from a moving vehicle can be blurred by stains on the camera, slow focus speed, and so on. The parking space identification algorithm cannot detect the two endpoints of the parking space entry line when the image is overly blurry. Therefore, the image sharpness of the collected image dataset is also an important factor affecting detection accuracy.

Journal of Advanced Transportation
The factors were classified based on the definitions to further quantify and compare the impact of each factor on the accuracy of identifying on-street parking spaces, and the classification criteria are shown in Table 2.

Failure Probabilities Corresponding to Different Combinations.
The application scenarios are first divided into four categories: (1) normal light on sunny days, (2) weak light at night, (3) harsh light on sunny days, and (4) rain and fog with low visibility. We then combine each application scenario (weather conditions) and the four influencing factors. The overall identification failure probability of the 200 parking spaces included in the simulation model corresponding to each combination is presented as a dataset. The failure probability refers to the percentage of 200 parking spaces subject to false and missed detection under different application scenarios and factors. Each application scenario corresponds to a total of $2^4 = 16$ different combinations of factors and their corresponding failure probabilities, resulting in 64 data items for the four application scenarios. The collected data are shown in Table 3, taking normal light on sunny days as an example. The values for each factor are determined according to the classification criteria in Table 2.
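The bookkeeping behind the 64 data items can be sketched as follows. This is an illustration under stated assumptions: each factor is taken to have two levels coded 1 and 2 (the text states this coding only for image sharpness and mentions C = 2), and the function names are hypothetical.

```python
from itertools import product

SCENARIOS = (
    "normal light on sunny days",
    "weak light at night",
    "harsh light on sunny days",
    "rain and fog with low visibility",
)
FACTORS = ("D", "A", "C", "S")  # distance, abrasion, complexity, sharpness
LEVELS = (1, 2)                 # assumed two-level coding for every factor


def factor_combinations():
    """All level assignments of the four factors (2**4 = 16 of them)."""
    return [dict(zip(FACTORS, levels))
            for levels in product(LEVELS, repeat=len(FACTORS))]


def failure_probability(n_false, n_missed, n_spaces=200):
    """Share of parking spaces with a false or missed detection."""
    return (n_false + n_missed) / n_spaces


# One data item per (scenario, factor combination) pair.
data_items = [(scenario, combo)
              for scenario in SCENARIOS
              for combo in factor_combinations()]
```

For instance, 10 false and 30 missed detections among the 200 spaces would give a failure probability of 0.2.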
Figure 7 shows the statistics and visualization of the failure probabilities corresponding to different combinations of factors in four application scenarios. Figure 7(a) shows the probability of failure corresponding to each of the 16 combinations of influencing factors in the four application scenarios, while Figure 7(b) better compares the impact of the four application scenarios on detection.
The failure probability can reflect the application effect of the parking space identification model in different scenarios. The failure probabilities obtained for normal light on sunny days and weak light at night are roughly similar for different combinations. In the application scenario of rain and fog with low visibility, the failure probability of parking space detection was high for all 16 cases, and the failure probability reached 60% for individual cases. The white fog may weaken the strong contrast between the white entry line and the road color in the overhead-view pictures taken on rainy and foggy days. The difference between the white entry line and the road color after graying out is unclear. The algorithm cannot determine the parking space entry line, resulting in detection failure. Thus, the model cannot currently be applied successfully to this scenario. In summary, the four scenarios rank as follows in order of decreasing effectiveness of the algorithm: normal light on sunny days, weak light at night, harsh light on sunny days, and rain and fog with low visibility.
Additionally, the comparison reveals that the influencing factor of image sharpness (S) has less impact on the identification accuracy. The peak in the line plot at scene complexity (C = 2) indicates that the traffic complexity (C) near the parking space has a greater impact on identification accuracy.

XGBoost-Based Trustworthiness Assessment Model
5.1. Data Description. To construct a predictive model for the accuracy of parking detection under different factors, it is also necessary to obtain specific values for the different factors and the corresponding detection results for each parking space. In contrast to the graded data, the specific data focus on the failure probability of parking detection for a single parking space under different factors.
A specific dataset includes specific values for the four influencing factors and detection results. A total of 518 random frames were sampled from the PreScan simulation model under normal light on sunny days by stitching the images together. Part of the specific data is shown in Table 4.
The specific data no longer classify the roadside distance, entry line abrasion, or traffic complexity but remain precise to a specific value. However, the image sharpness is still divided into two levels, 1 and 2. The detection results are 0, 1, or 2, corresponding to correct, missed, and false detection, respectively. Each type of data collection is specified as follows:
(1) Roadside distance: in the simulation, the lane width is set to 3.5 m, so the vertical distance from any point in the vehicle's trajectory to the on-street parking space is available. The values range from 1.7 m (vehicles traveling near the center line of the lane adjacent to the parking space) to 5.3 m (vehicles traveling near the center line of the second lane from the parking space).
(2) Line abrasion: the abrasion of the entry line is recorded as a specific value for each parking space rather than as a grade.
(3) Scene complexity: the value of the scene complexity is computed by equation 5.
(4) Image sharpness: as it is impossible to quantify the degree of camera contamination in the simulation, the clarity of the image data is still graded by setting the camera effects of the sensor. A value of 1, when set to "Default," means that the camera is clear, while a value of 2, when set to "DirtyWindow," means that the camera is contaminated and the collected images are blurred.

Model Description.
Ensemble (integrated) learning is used to train and fit the data to classify and predict the results corresponding to any combination of factors. Ensemble learning combines multiple weak learners to obtain a superior, more comprehensive, strong learner. Compared to a single weak learner, ensemble learning is faster, better in real time, and more accurate. The XGBoost algorithm is an implementation of ensemble learning, which significantly improves the speed and efficiency. Therefore, we construct an evaluation prediction model based on the XGBoost algorithm to improve the interpretation and prediction of the impact of four environmentally relevant factors on parking space identification.

The objective function of the XGBoost algorithm is
$$\mathrm{Obj}^{(t)} = \sum_{i} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) + \mathrm{constant},$$
where $l$ is the loss function, constant is a constant term, $f_t(x_i)$ is a regression tree, and $\Omega(f_t)$ is the regular term (including L1 and L2 regularization), used to define the complexity. This limits the number of leaf nodes in the tree to avoid the tree being oversized. The smaller the value of this term, the lower the complexity and the greater the generalization ability. The expression is as follows:
$$\Omega(f_t) = \gamma T + \frac{1}{2} \lambda \|\omega\|^2,$$
where $T$ is the number of leaf nodes, $\|\omega\|$ is the norm of the leaf node vector, $\gamma$ is the difficulty of the node cut, and $\lambda$ is the L2 regularization factor. The ultimate goal of XGBoost is to make the predicted value $\hat{y}_i$ as close as possible to the true value $y_i$ with as good a generalization as possible.
The core idea of the XGBoost algorithm is to continuously perform feature splitting to grow a tree. With each added tree, a new function $f(x)$ is learned to fit the residuals of the last prediction:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in \mathcal{F}, \quad \mathcal{F} = \{ f(x) = \omega_{q(x)} \},$$
where $\omega_{q(x)}$ is the score of leaf node $q$, $\mathcal{F}$ corresponds to the set of all $K$ regression trees, and $f(x)$ is one of the regression trees. When the training is completed to obtain $K$ trees, the predicted value of a sample is the sum of the scores of the corresponding leaf nodes of each tree.
Compared to the classical GBDT algorithm, the XGBoost algorithm has undergone some improvements, significantly improving effectiveness and performance. The XGBoost algorithm applies a second-order Taylor expansion to the objective function, preserving more information about the objective function. The XGBoost algorithm also adds a strategy to automatically handle features with missing values. Samples with missing values are automatically partitioned by placing them in the left or right subtree and comparing the objective function under the two solutions. The algorithm does not require preprocessing of missing features for padding [56].
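As a hedged illustration of the second-order expansion mentioned above, the objective at boosting step $t$ can be approximated as shown below. The symbols $g_i$ and $h_i$ (the first and second derivatives of the loss with respect to the previous prediction) and $\hat{y}_i^{(t-1)}$ (the prediction after $t-1$ trees) are notation introduced here for the sketch; $l$, $f_t$, and $\Omega$ are the loss function, the new regression tree, and the regular term.

```latex
\mathrm{Obj}^{(t)} \approx \sum_{i} \left[ l\!\left(y_i, \hat{y}_i^{(t-1)}\right)
  + g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t^{2}(x_i) \right] + \Omega(f_t),
\qquad
g_i = \frac{\partial\, l\!\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial\, \hat{y}_i^{(t-1)}},
\quad
h_i = \frac{\partial^{2} l\!\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial \left(\hat{y}_i^{(t-1)}\right)^{2}}
```

Because the first term inside the sum does not depend on $f_t$, only the $g_i$ and $h_i$ terms and the regular term matter when choosing the new tree.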

Data Analysis.
The 518 data items collected were divided into training and test sets in the proportion 3 : 1. Subsequently, a stratified K-fold division was conducted to ensure the stability and reliability of the final model. Stratified K-fold division divides the dataset into K mutually exclusive subsets that preserve the class proportions, over which cross-validation is conducted. Cross-validation enables all the data to be used as training and test sets, equivalent to expanding the dataset.
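The 3 : 1 split and the stratified fold assignment can be sketched in plain Python. This is an illustrative stand-in rather than the procedure actually used: the function names, the fixed random seed, and the round-robin fold assignment are assumptions made for the sketch.

```python
import random
from collections import defaultdict


def _indices_by_class(labels):
    """Group sample indices by their class label."""
    groups = defaultdict(list)
    for index, label in enumerate(labels):
        groups[label].append(index)
    return groups


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Split indices into train/test so each class keeps roughly the
    same share in both parts (test_fraction=0.25 gives a 3 : 1 split)."""
    rng = random.Random(seed)
    train, test = [], []
    for indices in _indices_by_class(labels).values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def stratified_folds(labels, k=4, seed=0):
    """Assign indices to k mutually exclusive folds, dealing each
    class's shuffled samples round-robin across the folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in _indices_by_class(labels).values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds
```

In cross-validation, each fold serves once as the validation set while the remaining folds are used for training.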
We initialize the model using the wrapped classifier and regressor in XGBoost. Some of these model parameters are set as follows: max_depth = 5, learning_rate = 0.1, and n_estimators = 160. A range of indicators for the model was obtained by cross-validation, as shown in Table 5.
The prediction accuracy obtained by the XGBoost model on this dataset was 75.97%. The F1-scores corresponding to correct and missed detection are 0.76 and 0.78, respectively, both at a high level, indicating that the model performs well in predicting these two types of cases. Conversely, the F1-score corresponding to false detections is only 0.33, indicating that the model is less trustworthy in predicting such cases.
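For readers unfamiliar with the metric, the per-class F1-score reported above can be computed as in the sketch below. The function name is hypothetical and the labels in the usage note are made-up toy values, not results from the paper; the class coding 0, 1, 2 follows the correct, missed, and false detection coding of the dataset.

```python
def f1_per_class(y_true, y_pred, classes=(0, 1, 2)):
    """Per-class F1 = 2*TP / (2*TP + FP + FN), treating each class in
    turn as the positive class; returns 0.0 when a class never occurs."""
    scores = {}
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denominator = 2 * tp + fp + fn
        scores[c] = 2 * tp / denominator if denominator else 0.0
    return scores
```

On the toy input `y_true = [0, 0, 1, 1, 2]` and `y_pred = [0, 1, 1, 1, 0]`, the rare class 2 is never predicted correctly, so its F1-score is 0 even though overall accuracy is 60%, which mirrors how a rare class can score poorly while the headline accuracy stays high.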
After stratifying the dataset by four folds and cross-validating, the scores for each fold and their average scores were obtained, as shown in Figure 8(a).
Figure 8(a) shows that the cross-validation scores for each fold of the dataset were high and reached a mean of 0.73, indicating that the model has good generalization ability. Because the maximum depth of the tree is set, the effect of model overfitting on the accuracy of the prediction results is circumvented or weakened.
Meanwhile, the relative importance of the four factors affecting the accuracy of on-street parking space identification was obtained experimentally, as shown in Figure 8(b). Figure 8(b) shows that roadside distance has the greatest impact on the accuracy of the parking space recognition algorithm, reaching 0.36. This is followed by the entry line abrasion and traffic complexity, which are of similar importance at 0.29 and 0.27, respectively. The influence of image sharpness was only 0.08, indicating that this factor barely affected the accuracy of the parking space identification algorithm. The results were analyzed only qualitatively, as the volume of data was insufficiently large. The importance of the three factors, roadside distance, entry line abrasion, and traffic complexity, may change as the volume of data rises. Overall, the three are close in importance and cannot be precisely ranked in terms of their impact.

Data Correction.
The number of false detections is very small compared to the other two types of detections. The number of missed detections is much higher than the number of false detections when the values of the four factors are close. Considering that neither missed nor false detection can accurately provide the occupancy status of the parking spaces, the two are combined into one category, incorrect detection, to improve the prediction accuracy of the model trained on the small sample data. The model then provides only two predictions: correct and incorrect detection.
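The relabeling step amounts to a one-line mapping. In the sketch below, the input coding (0 correct, 1 missed, 2 false) follows the data description, while the output coding (0 correct, 1 incorrect) and the function name are assumptions made for illustration.

```python
def merge_incorrect(labels):
    """Map the three-class labels (0 correct, 1 missed, 2 false) to two
    classes: 0 for correct detection and 1 for incorrect detection."""
    return [0 if label == 0 else 1 for label in labels]
```

For example, `merge_incorrect([0, 1, 2, 0])` yields `[0, 1, 1, 0]`.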
The corrected data were imported, and the XGBoost model was retrained. A series of model evaluation indicators were obtained, as shown in Table 6. The model's prediction accuracy improved from 75.97% to 78.29%, and the F1-scores of both predictions and their weighted averages were high, indicating that the model's prediction results have a high degree of trustworthiness. The stratified four-fold cross-validation scores before and after data correction were compared, as shown in Figure 9. Cross-validation scores generally improved after the data were corrected. The average cross-validation score improved from 0.73 to 0.77, and the model's predictive accuracy improved, indicating that the model has excellent generalization ability.

Conclusion
This study proposes a trustworthiness assessment framework for crowdsourcing-based citywide parking availability detection. Four environment-related factors that affect the parking detection algorithm (the distance between the CAV and the target parking space, line abrasion, scene complexity, and image sharpness) are determined through a series of field and simulation experiments. A failure probability prediction model of parking availability sensing is developed based on the XGBoost algorithm, which can reveal the influence mechanism of different external factors on the data accuracy. The experimental results show that the average prediction accuracy of the model is 78.29%, enabling the detection vehicle to determine the extent of algorithmic sensory failure while identifying parking spaces. The impact of the scene complexity is the most pronounced, with camera contamination having a very weak effect. Assessing trustworthiness in this way avoids unnecessary trips arising from excessive trust in the results of parking space detection. The model can effectively assess the trustworthiness of crowdsourced data and significantly reduce the impact of quality issues arising from sensor identification and incomplete information.

Data Availability
The data generated during the current study are owned by the Key Laboratory of Road and Traffic Engineering of the Ministry of Education, Tongji University, and are not publicly available. Contact the corresponding author for further details.

3.1.3. Parking Status Classification. Regularization is required to maximize the classification performance due to the varying sizes of parking spaces in the surround-view image. Therefore, parking spaces are cut and warped to a uniform size of 120 × 46 pixels, depending on their position in the image. Perspective transformation techniques are used to implement this warping process. The four boundary points of the parking spaces in the image are used as source points. The target points are the four vertices of a fixed rectangle of 120 × 46 pixels. A series of labeled images is thus obtained, and these images are divided into positive and negative samples. The positive samples are the vacant parking spaces, and the negative samples are the occupied parking spaces. The number of training samples is then further increased by a 180° rotation transformation.
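The warping step can be illustrated with a small sketch that estimates the perspective transform (a homography) from the four boundary points to the corners of a 120 × 46 rectangle, then maps points through it. This is a plain-Python illustration under stated assumptions: the corner ordering, the example coordinates, and the use of pixel indices 0..119 and 0..45 for the target rectangle are all assumed, and the helper names are hypothetical. In practice a library routine such as OpenCV's `cv2.getPerspectiveTransform` would typically be used instead.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def perspective_transform(src, dst):
    """Return the 8 homography coefficients mapping 4 src points to dst."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(A, b)


def warp_point(h, x, y):
    """Map an image point into the normalized parking-space rectangle."""
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


def rotate_180(image):
    """Rotate a 2D list image by 180 degrees (the augmentation step)."""
    return [row[::-1] for row in image[::-1]]


WIDTH, HEIGHT = 120, 46
# Assumed corner order: top-left, top-right, bottom-right, bottom-left.
target = [(0, 0), (WIDTH - 1, 0), (WIDTH - 1, HEIGHT - 1), (0, HEIGHT - 1)]
# Hypothetical boundary points of one parking space in the image.
source = [(210, 95), (330, 110), (322, 160), (205, 140)]

h = perspective_transform(source, target)
mapped = [warp_point(h, x, y) for x, y in source]
```

Each source corner should land on the corresponding target corner; filling the 120 × 46 patch then requires sampling the source image through the inverse mapping, which is omitted here.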

Figure 3: Relevant parameters for sensor installation position and output setting.

Figure 4: Judgment of parking space detection algorithm.

Table 1: Obstacle category and weight.

Figure 6: Comparison of semantic segmentation: (a) before the split; (b) after the split.

Distance (D)
D = 1: the detection vehicle is less than 2.5 m from the entry line of the parking space, driving on or near the center line of the lane adjacent to the parking space.
D = 2: the detection vehicle is 2.5-5 m from the entry line of the parking space, and its driving trajectory overlaps the left lane line of the lane adjacent to the parking space.
Line abrasion (A)
A = 1: the white line of the parking space is clearly visible, with only a little abrasion.
A = 2: some of the parking space lines are worn so badly that the white line is invisible.
Scene complexity (C)
C = 1: the overall complexity of the model is less than 50.
C = 2: the overall complexity of the model is greater than or equal to 50.
Image sharpness (S)
S = 1: image data are clear.
S = 2: image data are blurred.
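The factor levels listed above can be encoded as a small feature vector before being passed to a classifier. The sketch below is one possible encoding, not the study's implementation: the function name, the argument names, and the choice to reject distances beyond 5 m (which the level definitions do not cover) are assumptions.

```python
def encode_factors(distance_m, line_worn, complexity, blurred):
    """Map raw observations to the two-level factors D, A, C, and S.

    distance_m: distance from the vehicle to the entry line, in meters
    line_worn:  True if parking lines are worn to invisibility
    complexity: overall scene complexity score
    blurred:    True if the image is blurred
    """
    if not 0 <= distance_m <= 5.0:
        # Assumption: levels are only defined for distances up to 5 m.
        raise ValueError("distance outside the 0-5 m range of the levels")
    return {
        "D": 1 if distance_m < 2.5 else 2,
        "A": 2 if line_worn else 1,
        "C": 1 if complexity < 50 else 2,
        "S": 2 if blurred else 1,
    }


# Hypothetical observations for two detections.
near_clear = encode_factors(1.8, line_worn=False, complexity=42, blurred=False)
far_worn = encode_factors(2.5, line_worn=True, complexity=50, blurred=True)
```

With two levels per factor there are 16 possible combinations, so a failure probability can be tabulated for each combination.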

Figure 7: (a) The probability of failure corresponding to different combinations of factors in four application scenarios. (b) Comparison of results in four scenarios.

Table 3: Failure probabilities corresponding to different combinations under normal light on sunny days.

Table 4: Preview of specific data.