New Role of Thermal Mapping in Winter Maintenance with Principal Components Analysis

Thermal mapping uses IR thermometry to measure road pavement temperature at a high resolution to identify and map sections of the road network prone to ice occurrence. However, measurements are time-consuming and ultimately only provide a snapshot of road conditions at the time of the survey. As such, surveys need to be restricted to a series of specific climatic conditions during winter. Typically, five to six surveys are used, but it is questionable whether the full range of atmospheric conditions is adequately covered. This work investigates the role of statistics in adding value to thermal mapping data. Principal components analysis is used to interpolate between individual thermal mapping surveys to build a thermal map (or even a road surface temperature forecast) for a wider range of climatic conditions than that permitted by traditional surveys. The results indicate that when this approach is used, fewer thermal mapping surveys are actually required. Furthermore, comparisons with numerical models indicate that this approach could yield a suitable verification method for the spatial component of road weather forecasts, a key issue currently in winter road maintenance.


Introduction
On marginal nights in winter (i.e., where the temperature is close to freezing), a difficult decision is often faced by highway engineers of whether or not to treat the road network to prevent ice formation. Traditionally, this decision is facilitated by consulting a road weather information system, consisting of a network of site-specific road weather outstations and associated daily forecasts. Such technology was developed in the 1970s and became a common feature of winter road maintenance in developed countries by the mid-1980s.
Automatic road weather outstations provide measurements of key meteorological and road surface parameters including air temperature, dew point, precipitation, and wind speed and direction. Additional sensors are also embedded in the road surface to provide decision makers with up-to-date information on the current road surface temperature (RST) and surface state of the pavement. Numerical models are then used to provide road weather forecasts for the outstation sites (with the outstation providing a means for model initialisation and verification). A range of forecast models now exists and a significant body of literature has accrued; Hammond et al. give a thorough review [1]. However, this approach is ultimately site-specific and, with variations of over 10 °C not uncommon around a road network [2], a reliable means of forecast interpolation is also required. This has traditionally been achieved via thermal mapping, but of all the components contained within road weather information systems (RWIS), it is this interpolation that has frequently been identified as the least satisfactory [3].
The thermal mapping methodology is based around IR thermometry, which permits a high resolution survey of road pavement temperatures. Surveys are conducted in winter under a range of atmospheric conditions, ideally just before dawn, to build up a "thermal fingerprint" of RST variations around the route. These fingerprints are then translated into thermal maps and then used for daily forecasting alongside numerical forecast models [2, 4-13]. Other uses for thermal mapping include the optimisation of routes for anti-icing [14] or identification of locations for the installation of road weather outstations. However, thermal measurement campaigns are very time-consuming. It is impossible to survey a whole road network at the same time, and the task has to be partitioned into stretches that can be surveyed within the open time window at dawn to avoid temperature artefacts due to a rising sun.
The extent of RST variation along a route (and thus the amplitude of the thermal fingerprint) is controlled by atmospheric stability. The greatest variations are observed during stable conditions associated with anticyclonic weather patterns as indicated by Thornes [4]. With decreasing atmospheric stability, the amplitude of the thermal fingerprint subsequently reduces. Shao et al. [7] have shown that under certain weather conditions the spatial variation of RST along a route appears in a consistent pattern. It is this consistency which enables thermal mapping surveys to be conducted under a few selected weather conditions (i.e., extreme, intermediate, and damped) and quantified through the analysis of the average wind speed and cloud cover during the 12-hour period preceding the survey [1]. This has led to an operational standard of five to six surveys (two for each category) typically commissioned to provide coverage of the conditions encountered in a winter season. However this is inadequate with respect to the full variety of winter conditions actually experienced and results in daily forecasts being "pigeon-holed" into one of the categories when used operationally [10].
The last decade has seen a gradual change in practice, moving away from thermal mapping and its associated limitations to a new approach based on spatial modelling. Route-based forecasts take into account both meteorological and geographical data to provide a high resolution forecast of road surface temperature and condition around the road network [3]. Whilst this provides a potentially significant improvement in the quality of forecasts, it has also brought about a new set of challenges. Whereas traditional site-specific forecasts could be easily validated against sensor data from outstations located at the forecast sites, this is clearly impossible for a route-based forecast [1]. Consequently, thermal mapping is still required to provide data to verify the spatial component of a route-based forecasting system. However, thermal mapping is too expensive and time-consuming to provide detailed data at a high temporal resolution, which means that route-based forecasts can presently only be verified using "snapshots" from occasional thermal mapping surveys.
The aim of this study is to use principal components analysis (PCA) to statistically analyse thermal mapping data and so obtain a means of interpolation between surveys. This would provide a more comprehensive picture of RST variation for a broader range of atmospheric conditions than that traditionally covered by thermal mapping surveys. Such an approach could improve the verification of route-based forecasts, make thermal mapping more cost-effective, and could even lead to a simple model of ice susceptibility for use on road networks.

Study Areas.
Both the University of Birmingham and Nancy Laboratory have been using vehicles for thermal mapping from the beginning of the existence of the technique [1,15]. As such, both organisations have accumulated a substantial quantity of thermal data for analysis. For this investigation, two historic research routes covering both urban and rural areas were selected for detailed analysis. The first route, located in France (Figure 1(a)), is almost 30 km long and covers a range of land uses and lane configurations. A vast thermal dataset is available for this route containing approximately 50 thermal fingerprints obtained under extreme and intermediate weather conditions (Figure 1(c)). The second route (Figure 1(b)) is based in Birmingham, UK, and also contains a range of different land uses, road-types, and lane configurations. This dataset contains approximately 20 thermal fingerprints, collected under extreme, intermediate, and damped conditions (Figure 1(d)). Both roads belong to the same climate classification zone [16,17].

Equipment.
On both routes, RST was calculated by using an infrared radiometer fitted to the underside of a survey vehicle, which measures the energy flux density emitted by the surface. RST is calculated through simple manipulation of the Stefan-Boltzmann equation [18]:
$$E = \varepsilon \sigma T^{4}, \tag{1}$$
where $E$ is the measured energy flux density, $T$ is the RST, $\sigma$ is the Stefan-Boltzmann constant, and $\varepsilon$ is the emissivity of the road surface. The road surface is considered to be a grey body and as such emissivity is held constant at 0.95 [19,20]. In the case of the French study route, air temperature and relative humidity data are also obtained via sensors located on the roof of the vehicle, with an electrical turbine generating a laminar flow. All vehicles are equipped with GPS to facilitate the plotting of measurements in a geographical information system (GIS). Figure 2 displays thermal mapping vehicles used in France and in the United Kingdom.
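As an illustration of the inversion described above, the following minimal sketch converts a measured flux density into a surface temperature for a grey body with emissivity 0.95. The function names are hypothetical, and the sketch assumes that the radiometer reading is purely emitted radiation (reflected sky radiation and any instrument calibration are ignored); it is not the processing chain used on the survey vehicles.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4
EMISSIVITY = 0.95       # grey-body road emissivity quoted in the text


def flux_from_rst(rst_celsius, emissivity=EMISSIVITY):
    """Forward relation E = eps * sigma * T^4, with T converted to kelvin."""
    return emissivity * SIGMA * (rst_celsius + 273.15) ** 4


def rst_from_flux(flux_w_m2, emissivity=EMISSIVITY):
    """Invert E = eps * sigma * T^4 and return the temperature in degrees Celsius."""
    if flux_w_m2 <= 0.0 or not 0.0 < emissivity <= 1.0:
        raise ValueError("flux must be positive and emissivity in (0, 1]")
    kelvin = (flux_w_m2 / (emissivity * SIGMA)) ** 0.25
    return kelvin - 273.15


# Round trip for a road surface at -2 degrees Celsius (illustrative value).
flux = flux_from_rst(-2.0)
recovered = rst_from_flux(flux)
```

Because the temperature enters to the fourth power, a small error in the assumed emissivity translates into a temperature error of roughly one quarter of the relative emissivity error, which is one reason a fixed grey-body value is a workable simplification.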

Principal Components Analysis and RST Forecast.
Principal component analysis (PCA) is a statistical method that reduces the dimension of large datasets by projecting the data onto a lower-dimensional space [21,22]. It can be computed with the nonlinear iterative partial least squares (NIPALS) algorithm or by singular value decomposition. The statistical tool used is the variance-covariance matrix. Linear transformations of a group of correlated variables are obtained in such a way that certain optimal conditions are met. The most important of these conditions is that the transformed variables are uncorrelated, resulting in orthogonal eigenvectors. In the PCA approach, the physics that generates the variations is ultimately replaced by a statistical approximation containing a linear combination of the physical factors at play. The number of initial variables involved in the description of the physical phenomena resulting in thermal fingerprints is reduced to a lower number of variables called principal components. The data are then projected into the space of these principal components, built on linear combinations of the real physical factors. Calculations are conducted to identify the space capturing the highest variance, generating axes along which the data tend to gather. In the case of thermal mapping, each run is considered to be a sample. Each series of RST measurements on the same route (over fifty samples for France and nearly twenty for the UK) corresponds to one data point in a multidimensional space. The variables are attached to the location where the measurements were made. Variables include meteorological, geographical, and road ones, as indicated by Hammond et al. [23], but this list is not exhaustive. In the case of a route tens of kilometres long, with measurements made at a given spatial frequency of a few metres, each thermal fingerprint sample contains thousands of points, each being a variable.
Among all possible physical variables per location, an illustrative set of 24 variables commonly considered to affect RST could be: air temperature, relative humidity/dew point, precipitation, cloud cover, wind speed, solar radiation, ground radiation, and weather situation (extreme, intermediate, damped) for meteorological variables; latitude, altitude, topography, screening, topographic exposure, sky view factor, land use, and infrastructure specificity (bridges, etc.) for geographical variables; and thermal conductivity, thermal diffusivity, emissivity, convective coefficient, albedo, traffic, construction depth, and soil water content for infrastructure variables. As an example, the water content of soil is clearly not applicable in the case of a bridge, whereas the convective coefficient is critical. All will vary from location to location, sometimes significantly. Each principal component (PC) axis is then built as a linear combination of these variables (with the ones given in Table 1 among them) multiplied by several thousands of locations. By using the data from several thermal surveys, a data matrix, denoted $\mathrm{RST}_{\mathrm{PCA}}$, was generated. The matrix has as many rows as thermal fingerprints available and as many columns as distance points where measurements were performed. Each thermal fingerprint corresponds to one point in the PC space, as illustrated in Figure 3. Clusters of points could be detected in this multidimensional space. The further apart samples lie along each component, the greater the difference between them. An approximation of each sample is obtained by projecting it onto the first principal components. Similar data points appear close to each other, while "extreme" ones appear at an increased distance from the origin of the PC space. The initial variables have been centred, so that the barycentre of the data points corresponds to the origin of the PC axes. Since the variables are similar (RST with similar variance), the data have not been standardized.
The quality of the representation is determined by the residuals (i.e., the distance between each data point and the space spanned by the selected number of axes).
An orthogonal set of eigenvectors, called loadings, is generated that spans the variance space of the data. Each successive eigenvector is chosen to minimize the residual variance (Figures 4(a) and 4(b)). This operation is performed until the optimal number of principal components selected for the description is reached. Once completed, the samples are represented in a new system of coordinates, or "scores," gathered in a matrix $T$. These scores are computed as a linear combination of the initial variables, with given weights. These weights are gathered in a loadings matrix usually named $P$.
With respect to thermal mapping, PCA can be used to determine how many components are needed to describe the variability of the data constituting the thermal fingerprints of the two routes obtained in various weather conditions. Once these components are identified, they can then be used to build additional fingerprints for other weather conditions, provided that these are not significantly different from the ones used for the original PCA calculations. Therefore, the PCA model can be written as follows, with scores matrix $T$, loadings matrix $P$, and the leftover part of the variations represented as an error matrix $E$:
$$\mathrm{RST}_{\mathrm{PCA}} = T P^{\mathsf{T}} + E. \tag{2}$$
Hence, the larger the number of thermal fingerprints, the larger the number of principal components available to describe RST variations. The number of relevant principal components is determined by the explained variance and by the loadings, which enable deduction of the physics hidden in the statistical description. The higher the order of a principal component, the more it resembles noise. By using the scores and loadings matrices, it becomes possible to build an RST profile from PCA, eventually neglecting the error matrix. Based on this approach, the first objective of this paper is to identify the global benefit of PCA and the number of thermal mapping runs required to produce an accurate daily temperature pattern along a given route. The next objective is to extend the process to build forecast thermal fingerprints, thus providing a simple spatial forecasting model based on a cost-effective and realistic number of mapping runs.
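The decomposition into scores and loadings can be sketched with a singular value decomposition, one of the two computation routes mentioned earlier. The example below is a minimal illustration on synthetic fingerprints, not the Unscrambler workflow used in this study; the function names and the synthetic data are assumptions. Since the variables are centred before the decomposition, the column mean is added back when a profile is rebuilt.

```python
import numpy as np


def pca_fit(data, n_components):
    """Return column mean, scores T, loadings P and explained-variance ratios.

    `data` has one row per thermal fingerprint and one column per location.
    """
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    centred = data - mean                      # centring only, no standardization
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components].T             # orthonormal columns
    explained = (s ** 2) / np.sum(s ** 2)
    return mean, scores, loadings, explained[:n_components]


def pca_reconstruct(mean, scores, loadings):
    """Rebuild fingerprints from scores and loadings, neglecting the error matrix."""
    return mean + scores @ loadings.T


# Synthetic example: four fingerprints sharing one spatial pattern, with
# different mean levels and amplitudes (values are illustrative only).
distance = np.linspace(0.0, 1.0, 200)
pattern = np.sin(6.0 * np.pi * distance)
levels = np.array([-2.0, 0.5, 3.0, 6.0])
amplitudes = np.array([3.0, 2.0, 1.0, 0.5])
fingerprints = levels[:, None] + amplitudes[:, None] * pattern[None, :]

mean, scores, loadings, explained = pca_fit(fingerprints, n_components=2)
rebuilt = pca_reconstruct(mean, scores, loadings)
```

In this synthetic case the fingerprints differ only in level and amplitude, so two components rebuild them to within rounding error; with field data the discarded components carry the residual that the error matrix represents.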
To do so, an $\mathrm{RST}_{\mathrm{PCA}}$ profile is treated as a one-column matrix in which each element is the RST at a point of the itinerary ($\mathrm{RST}_{\mathrm{PCA},1}, \mathrm{RST}_{\mathrm{PCA},2}, \ldots, \mathrm{RST}_{\mathrm{PCA},n}$), $n$ being the final point of the given route. For two profiles, $\mathrm{RST}_{\mathrm{PCA}\,1}$ and $\mathrm{RST}_{\mathrm{PCA}\,2}$, an interpolated fingerprint, $\mathrm{RST}_{\mathrm{PCA\,interpolated}}$, is obtained by using (3), a form traditionally used to denote continuation:
$$\mathrm{RST}_{\mathrm{PCA\,interpolated}} = \alpha\,\mathrm{RST}_{\mathrm{PCA}\,1} + (1 - \alpha)\,\mathrm{RST}_{\mathrm{PCA}\,2}, \tag{3}$$
where $\alpha$ is a coefficient whose value ranges between 0 and 1. In winter maintenance, it is logistically impossible to obtain daily thermal fingerprints to verify RST. Instead, there is a dependence on site-specific RWIS outstations to monitor atmospheric and road parameters such as air temperature, relative humidity, and RST. As such, it is difficult to build a full RST profile over a route on the basis of one single data point obtained from an outstation. Numerical weather models have long been able to provide a forecast for this specific spot or, more recently, over the whole route using route-based forecasting techniques [10]. In the same way, using the RST at a single outstation, $\mathrm{RST}_{\mathrm{PCA}}$ can be used to extend the forecast away from the outstation site. Here, coefficient $\alpha$ is used to match the local data point with

Computing the Optimum Number of Measurements Sets.
The route survey in France contains data collected at a 3 m resolution. For consistency, the 10632 measurement points on this route were reduced and resampled with a moving average to approximate the spatial resolution of the route in the UK, which was surveyed at a 20 m resolution. Using the Unscrambler X 10.1 software package, PCA was then performed on the full set of the 53 measured French fingerprints and the 19 measured UK fingerprints. The results are given in Table 2 and Figure 5. Almost all the variability (99%) in the data is explained by the first and second principal components, with very few outliers (especially given the size of the dataset) and with data gathering around the axis of the first principal component (Figures 5(a), 5(c), and 5(e)). The next analysis focused on a subset of eight and then four fingerprints where the mean RST was below 5 °C, consistent with a winter maintenance configuration. Again, the same conclusion was reached on both the high value of explained variance and the low number of outliers (Table 1 and Figure 5). Figure 5 also shows the absence of clusters of points, which could have indicated the specific effect of given variables in such a global approach. These results indicate that as few as four thermal fingerprints are sufficient to resolve RST around the route, covering a set of weather conditions representative of winter (extreme, intermediate, and damped). Any further application of PCA to thermal fingerprints must then be performed on data obtained in weather conditions similar to the ones used for this PCA. In theory, PCA would permit the use of three fingerprints (i.e., $N-2$ principal components for a set of $N$ thermal fingerprints), whereas five to six surveys are originally specified to make the forecast as adequate as possible when numerical models are used. Indeed, even with just three fingerprints, 99% of the variance is explained by the first component.
However, a fourth thermal fingerprint is recommended to provide a more reliable result, and it still constitutes an operational saving on current practice. Ideally, this additional fingerprint would allow further emphasis of one weather condition (e.g., damped condition) in the final result. In the PCA of thermal fingerprints, the loadings of the higher principal components will mostly be noise. However, these loadings will also contain thermal singularities (e.g., bridges and decks), which might prove useful for a more detailed analysis of a specific spot but are not the topic of this paper. This is illustrated by the first four loadings of the PCA calculations performed on all thermal fingerprints of both the French (Figure 6(a)) and the UK (Figure 6(b)) routes. In such a global approach, no specific influence of any one of the chosen variables affecting RST could be identified.
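The resampling step mentioned at the start of this section, which brings the 3 m French survey close to the 20 m UK resolution, can be sketched as a moving average followed by decimation, implemented here as non-overlapping block means. The exact window and alignment used in the study are not stated, so the rounding of the window to 7 points (about 21 m) and the function name are assumptions.

```python
def resample_block_mean(values, src_step_m, dst_step_m):
    """Average consecutive blocks of points to approximate a coarser spacing.

    The window is the nearest whole number of source points per target step;
    a trailing partial block is dropped.
    """
    if src_step_m <= 0 or dst_step_m < src_step_m:
        raise ValueError("target spacing must be at least the source spacing")
    window = max(1, round(dst_step_m / src_step_m))
    resampled = []
    for start in range(0, len(values) - window + 1, window):
        block = values[start:start + window]
        resampled.append(sum(block) / window)
    return resampled


# A 10632-point profile at 3 m spacing reduced towards 20 m spacing.
coarse = resample_block_mean([0.0] * 10632, 3, 20)
```

Block averaging smooths very local features such as short bridge decks, so any analysis of thermal singularities would be better carried out on the original 3 m data.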
Once the initial PCA calculations are performed, thermal fingerprints from PCA are built using (2). A comparison between thermal data and PCA results for four selected French thermal fingerprints with RST below 5 °C yielded a good fit between the observed RST measurements and the PCA result curves (Figures 7(a) and 7(b)). The error distribution indicates that 93% of the error is within a ±0.5 °C interval and 99% is within ±1 °C, confirming the ability of PCA to generate an accurate representation of thermal fingerprints. The same PCA calculations were performed with four UK thermal fingerprints (Figures 7(c) and 7(d)), giving an error distribution of 91% within a ±0.5 °C interval and 98% within a ±1 °C interval. A similar PCA calculation was run for two thermal fingerprints of the UK route, corresponding to a damped and an intermediate weather situation. The error distribution indicated that 62% of the error is within a ±0.5 °C interval and 91% is within ±1 °C for only two thermal fingerprints. In the case of five thermal fingerprints on the UK route and five corresponding ENTICE outputs (Figure 7(e)), nearly 47% of the error is within ±1 °C and roughly 80% is within ±2 °C (Figure 7(f)).
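The error-distribution figures quoted above are the fraction of route points whose absolute difference between measured and rebuilt RST falls inside a tolerance. A minimal sketch of that statistic follows; the function name and the sample values are illustrative assumptions, not data from either route.

```python
def fraction_within(measured, modelled, tolerance):
    """Share of points where |measured - modelled| is within `tolerance` (deg C)."""
    if len(measured) != len(modelled) or len(measured) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    hits = sum(1 for m, p in zip(measured, modelled) if abs(m - p) <= tolerance)
    return hits / len(measured)


# Illustrative values only: absolute errors are about 0.2, 0.6, 0.05, 0.9, 1.3.
measured = [-1.2, -0.4, 0.3, 1.1, 2.0]
modelled = [-1.0, -1.0, 0.35, 0.2, 3.3]

share_half_degree = fraction_within(measured, modelled, 0.5)
share_one_degree = fraction_within(measured, modelled, 1.0)
```

Reporting the statistic at several tolerances, as done in this section, gives a compact view of the error distribution without assuming it is Gaussian.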

Spatial Forecasting Model.
The previous section has demonstrated the ability of PCA to obtain a representation of RST at given temperatures. Next, the ability to build new interpolated fingerprints, $\mathrm{RST}_{\mathrm{PCA\,interpolated}}$, from the results of the PCA ($\mathrm{RST}_{\mathrm{PCA}}$) is investigated. Such an approach would enable an improved verification strategy for route-based forecasting or indeed a basic linear spatial forecasting model in its own right. Figure 8 shows the different RST profiles (field measurements and profiles interpolated from PCA results using (3)) in France and the UK. Using separate testing fingerprints not used in the PCA calculations, 84% of the RST differences between statistically interpolated and actual measurements were found to be within a ±1 °C interval. However, in the case of the French route, in a similar PCA configuration but with extreme and intermediate weather situations, the error distribution indicates that only 48% of the data is within a ±1 °C interval. Such differences can be explained by the nature of the two study routes. The UK route has more thermal variations than the French route, a consequence of distinct land use types. This gives rise to a very distinctive thermal fingerprint, with very warm urban areas and cold rural areas readily identifiable across the dataset.
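The interpolation between two PCA-built fingerprints is a pointwise weighted average with a coefficient between 0 and 1. The sketch below also shows one possible way of choosing that coefficient from a single outstation reading, by solving the interpolation relation at the outstation location and clipping the result to the allowed range. This matching rule, the function names, and the sample profiles are assumptions made for illustration.

```python
def interpolate_fingerprint(profile_a, profile_b, alpha):
    """Pointwise alpha * a + (1 - alpha) * b for two equal-length profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of points")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(profile_a, profile_b)]


def alpha_from_outstation(profile_a, profile_b, index, observed_rst):
    """Coefficient that reproduces the observed RST at one location (assumed rule)."""
    a, b = profile_a[index], profile_b[index]
    if a == b:
        raise ValueError("profiles coincide at the outstation; alpha is undefined")
    alpha = (observed_rst - b) / (a - b)
    return min(1.0, max(0.0, alpha))  # keep the result an interpolation


# Illustrative three-point profiles with an outstation at index 1.
cold_night = [-3.0, -1.0, -2.0]
mild_night = [1.0, 3.0, 4.0]
alpha = alpha_from_outstation(cold_night, mild_night, 1, observed_rst=1.0)
estimate = interpolate_fingerprint(cold_night, mild_night, alpha)
```

Clipping keeps the estimate between the two source fingerprints, in line with the requirement that new fingerprints should not depart significantly from the weather conditions used for the original PCA.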

Comparison with a Numerical Model Approach.
Route-based forecasting was developed by Chapman et al. [3,9] and is essentially an improved road weather prediction model (ENTICE) based around the Thornes model [4], with an added high resolution and site-specific spatial component to predict local variations in RST over both time and space. The spatial component of ENTICE is driven by a geographical parameter database (GPD) consisting of several geographical parameters affecting RST. There is an extensive literature on ENTICE, indicating its ability to explain over 70% of the variation in RST [3,9]. Verification techniques are also extensively discussed by Hammond et al. [1]. The next analysis compares RST forecasts based on PCA calculations (using only four thermal fingerprints) with ENTICE outputs for the UK route described above. Four thermal fingerprints were selected, covering an average temperature range over the UK route between −1.9 °C and 6.4 °C. These fingerprints correspond to three intermediate and one extreme weather condition. A PCA calculation was run, and RST forecast profiles were built with these results and interpolations. Four RST forecasts from ENTICE were obtained for the same dates as those on which the thermal fingerprints used for the PCA were measured. To ease the comparison between forecasts from ENTICE and from PCA (calculations and interpolations), an RST average was then calculated over each forecast file. Considering that the profiles are similar enough, interpolations from PCA were established so as to obtain an average RST as close as possible to that of the thermal fingerprints not used for PCA calculations, and the same RST averages were compared to those from ENTICE outputs.
Figures 9(a) and 9(b) summarize the comparison between over 2000 ENTICE outputs and PCA calculations and interpolations. A good fit is obtained between the thermal fingerprints from the numerical model and the PCA, with an error distribution analysis indicating that 73% of the error is within ±1 °C and 96% is within ±2 °C. Table 2 gives the PCA overview results, with again 98% of variance explained with the first principal component. RST differences were then calculated between field measurements and PCA interpolations. The PCA model accuracy in predicting RST to be within ±1 °C is such that 84% of the differences are in this interval.
Furthermore, the forecast validation statistics indicate that PCA gives similar results compared to ENTICE. As detailed by Hammond et al. [1], the performance of ENTICE and its accuracy (Table 3) greatly depend on the correct choice of physical parameters and of a physical description of the route. For example, a detailed description of road structure for each forecast point is required (e.g., number of layers, layer thicknesses, and thermal characteristics of each layer). However, such details need to be parameterized in route-based forecasting as it is presently impossible to measure these parameters at a suitable resolution [24]. Ultimately, each forecast point of a route has a set of changing parameters that makes a proper forecast hard to reach and is the primary reason for noise in forecast verification statistics. Thermal mapping does not have this problem and, although it does not explain why the temperature variations are there, it can be used to supplement and verify model output. The inclusion of PCA in this approach now greatly improves the verification capabilities of thermal mapping.

Conclusion
The objective of this paper was to investigate a statistical approach for thermal mapping, based on PCA, to build a road surface temperature forecast for a wide variety of weather situations and temperature ranges. Overall, PCA provided a good forecast of road surface temperature, explaining up to 80% of measurements over a route. The results indicate that, by using this approach, fewer thermal surveys than are currently specified are required to recreate the road surface temperature forecast pattern along a route.
Further research was then conducted to compare PCA forecast results with outputs from an advanced numerical model. With the exception of "damped" conditions, the results indicate that PCA calculations yielded better results than ENTICE for the error distribution within the ±1 °C interval. The main benefit of the PCA approach is that the effects of uncertainties surrounding physical parameterisations in numerical models are overcome without approximation. Uncertainties are then reduced to those of the infrared radiometer.
Overall, the use of PCA essentially permits a continuum of thermal fingerprints and allows the user to statistically generate a fingerprint for any given night. This opens the possibility of verifying route-based forecasts on a nightly basis, without the need for costly additional surveys. Both approaches yield good results in terms of statistical validation and give further confidence in the route-based forecasting technique. However, there is further potential here. Given the performance of the PCA approach, could this method alone provide a cost-effective alternative to route-based forecasting, potentially leading to a new generation of forecast thermal maps?