A Precipitation Nowcasting Mechanism for Real-World Data Based on Machine Learning

Unpredicted precipitation, even when mild, may cause severe economic losses to many businesses. Precipitation nowcasting is hence significant for people to make correct decisions in time. For traditional methods such as numerical weather prediction (NWP), the accuracy is limited because the scale of strong convective weather is smaller than the minimum scale the model can capture, and the computation often requires a supercomputer. The optical flow method has also been shown to be applicable to precipitation nowcasting, but it is difficult to determine its model parameters because the tracking and extrapolation steps are separate. Meanwhile, current machine learning applications are based on well-curated, complete datasets, ignoring the fact that real datasets quite often contain missing data that require extra consideration. In this paper, we use a real Hubei dataset in which a few radar echo maps are missing and propose a suitable mechanism to deal with this situation. Furthermore, we propose a novel mechanism, based on machine learning techniques, for radar reflectivity data with single or cumulative altitudes. From the experimental results, we conclude that our method can predict future precipitation with high accuracy when a few data are missing, and that it outperforms the traditional optical flow method. In addition, our model can be applied to various types of radar data with type-specific feature extraction, which makes the method more flexible and suitable for most situations.


Introduction
Precipitation nowcasting is the task of predicting the future rainfall within 0-2 hours timely and accurately, which has always been a significant issue in weather forecasting. It greatly affects people's lives and production, for example, by providing a reference for daily travel and guiding aviation to improve flight safety. Additionally, it can reduce the losses caused by extreme weather to agricultural production by issuing early warnings. The traditional methods of precipitation nowcasting mainly include numerical models and radar extrapolation using multiple modes of data. Numerical weather prediction (NWP) predicts future rainfall by solving the large-scale equations describing weather evolution based on initial conditions. Although NWP models have been successfully employed for medium- and long-term forecasting, their nowcast capability is still limited due to the challenges of model initialization and the considerable computation involved. To reduce the complexity, nowcasting by extrapolation of radar echo maps has long been the most effective tool for precipitation nowcasting [1]. The optical flow method is a widely used extrapolation technique that predicts the weather through two steps, namely, tracking and extrapolation. In the tracking step, the velocity field is computed from a series of consecutive maps. In the extrapolation step, the predicted maps are obtained based on the velocity field. However, the optical flow method has limitations: since the velocity estimation step and the radar echo extrapolation step are separate, the determination of parameters becomes difficult. The shortcomings of traditional methods, together with the development of machine learning, have gradually aroused the interest of the machine learning community [2].
At present, there are mainly two solutions in the field of machine learning. One solution is to predict future radar echo maps based on the historical maps. For instance, Shi et al. proposed the Convolutional LSTM Network (ConvLSTM) [3], an LSTM model with a convolutional structure. Subsequently, they proposed the Trajectory Gated Recurrent Unit (TrajGRU) [4], whose main difference from ConvLSTM is that the recurrent connections are determined dynamically. The convolution operation in ConvLSTM and the location-variant connections in TrajGRU help the models capture the spatiotemporal movement, and their experiments showed that the models can roughly predict the clouds' motion. However, the generated radar echo maps become smoother and smoother until nothing remains in the map. Therefore, it is almost impossible to give sharp and accurate predictions of the global radar maps at longer lead times.
In addition to RNN-based models, Ayzel et al. proposed RainNet [5], a deep convolutional neural network for radar-based precipitation nowcasting inspired by U-Net [6]. They trained RainNet to predict the radar echo map at a lead time of five minutes from the past 4 consecutive maps. To reach a lead time of one hour, a recursive approach was implemented, using the five-minute prediction as model input for longer lead times; this recursion was repeated up to a maximum lead time of 60 minutes. In several applications, convolutional neural networks have turned out to make more accurate predictions than recurrent neural networks (RNN) [7]. Unfortunately, spatial smoothing was an undesirable property of RainNet's predictions and became increasingly apparent at longer lead times. In addition, the method is inapplicable to predictions beyond 60 minutes. The second solution is to use extracted features as model inputs to predict future precipitation, where the features are related to future precipitation trends. In the CIKM AnalytiCup 2017 challenge, contestants were provided with radar images within 1.5 hours at 4 different heights. Considering that there were many overlapping areas between image samples, the Marmot team performed template matching and image stitching before extracting features to obtain global images [8]. Then, they fed the features extracted from the global reflectivity images into a convolutional neural network model. However, this requires sufficient overlapping areas between samples; otherwise, the stitched global image is barely larger than the original, or the images cannot be stitched at all. Global features then cannot be obtained, which affects the prediction accuracy. In particular, for radar reflectivity data with single or cumulative altitudes, it is hard to obtain global images through template matching and stitching. The main reason may be that radar maps of reflectivity accumulated over altitudes share far smaller common areas than radar maps of multiple independent altitudes.
In the real world, partially missing data are inevitable, but little research has been conducted to address this problem. For RNN-based methods, since the number of radar images input to the model is fixed, missing data render the model inoperable. For the second solution, however, it is reasonable to fill in missing maps flexibly with maps of adjacent time or altitude, which has little effect on the subsequent steps of feature extraction and model training. Meanwhile, the filling does not lower the prediction accuracy much, because the extracted features reflect the precipitation trend from different aspects.
In this paper, we adopt the second solution. First, we extract representations that are closely relevant to future precipitation trends. The representations are then fed into several machine learning models, namely, a convolutional neural network (CNN), a neural network (NN), and a gradient boosting decision tree (GBDT). Finally, we combine the model results to obtain the predicted precipitation. For evaluation, we build a new realistic dataset, which contains radar echo maps, actual rainfall measured by ground monitoring stations, and the altitudes of the monitoring stations. Considering that some radar data are missing, and in order to obtain as many training samples as possible while staying in line with reality, we keep the samples in which the number of missing radar maps does not exceed 3 when building the dataset. The dataset built in this way has little influence on the prediction performance. Furthermore, we use the traditional optical-flow-based methods from the rainymotion library as benchmarks for comparison. Finally, common metrics are used for evaluation, namely, root mean square error (RMSE) and mean absolute percentage error (MAPE). The main contributions of the paper are twofold: (1) Given that a few data may be missing in reality, to preserve as many samples as possible, we retain samples in which a few radar maps are missing and test the efficacy, which shows a good result. Any real dataset that contains a few missing data can be handled in this way, remedying the missing-data weakness. (2) We propose a precipitation nowcasting mechanism that is suitable for radar reflectivity data with single or cumulative altitudes. The experimental results show that it has a low RMSE for precipitation nowcasting and outperforms the traditional method based on optical flow. The paper is organized as follows. In Section 2, we briefly introduce the traditional methods of precipitation nowcasting and machine learning methods.
Section 3 describes how we build the dataset, as well as our measures for handling missing data. We then present our precipitation nowcasting mechanism in Section 4. Section 5 introduces the benchmarks and evaluates and analyzes the results. Finally, in Section 6, we conclude the paper.

Traditional Methods.
Numerical weather prediction (NWP) is one of the traditional methods to predict future atmospheric movement and weather phenomena. In this method, the future weather is calculated by solving the mathematical and physical equations describing atmospheric motion, based on the initial state and boundary conditions of the atmosphere. However, since the scale of strong convective weather, such as short-term heavy precipitation and hail, is smaller than the minimum scale that the model can capture, its formation and evolution cannot be described accurately by the NWP model. Meanwhile, the nowcast capability of NWP is still limited due to the challenges of model initialization and the large amount of computation. Besides, it relies on the governing equations, so excessive regularization can negatively impact the accuracy of the forecast. Therefore, numerical models are not well suited for precipitation nowcasting.
At present, some computer vision technologies, especially the optical flow method, have been shown to be useful for the extrapolation of radar echo images [9]. For example, the Hong Kong Observatory (HKO) proposed the Real-time Optical flow by Variational methods for Echoes of Radar (ROVER) [10]. The forecasting process consists of two main steps. First, the motion vector field of the echo is obtained by calculating the optical flow field of the radar echo. Then, under the assumption that the vector field is unchanged, the radar echo maps are extrapolated based on this field. However, it is difficult to determine the model parameters because the estimation of the vector field and the extrapolation step are separate. Moreover, although this method can predict the clouds' movement trend, it cannot predict their dissipation. Given the limitations of traditional methods, and with the progress of machine learning, novel methods that apply machine learning to precipitation nowcasting are emerging.

Methods Based on the Neural Network.
Since traditional prediction methods are not suitable for dealing with the nonlinear relationships among precipitation factors, researchers began to utilize neural networks to predict precipitation. However, there are two issues when using neural networks for radar echo extrapolation. In essence, precipitation nowcasting is a spatiotemporal sequence forecasting problem with the sequence of historical radar maps as the input and the sequence of a fixed number of future radar maps as the output [3]. Besides, the prediction model needs to capture the spatiotemporal structure of the data well. In [3], Shi et al. formulated precipitation nowcasting as a spatiotemporal sequence forecasting problem in which both the input and the output are spatiotemporal sequences. They proposed a convolutional LSTM (ConvLSTM) network based on the fully connected LSTM (FC-LSTM) [11]. Unlike FC-LSTM, which uses a fully connected structure in state-to-state transitions, ConvLSTM uses convolution in both input-to-state and state-to-state transitions. An end-to-end trainable model can then be built by stacking multiple ConvLSTM layers. The prediction model can hence capture the spatiotemporal structure of the data well with the help of the convolutions and the LSTM's memory. In their experiments, they demonstrated that the ConvLSTM model outperformed the ROVER algorithm. Later, Shi et al. proposed the Trajectory GRU (TrajGRU) [4] based on ConvGRU [12]. Considering that natural motion and transformation are in general location-variant, the recurrent connections in TrajGRU are determined dynamically. The model can therefore actively learn the location-variant structure for recurrent connections, and it was verified on the HKO-7 benchmark [4] proposed in their paper. In their results, the TrajGRU model outperformed the ConvLSTM model for precipitation nowcasting.
However, for both RNN-based models, the radar echo images in the prediction sequence would become more and more blurred with increasing lead time until there was nothing in the prediction images. Although these RNN-based methods can predict the cloud's motion roughly, it is almost impossible to give sharp and accurate predictions of the global radar images in longer-term predictions.
In addition to the RNN-based methods, Ayzel et al. presented RainNet [5], inspired by U-Net [6], to predict short-term precipitation. The model follows an encoder-decoder architecture in which the encoder downscales the spatial resolution and the decoder upscales the learned patterns back to a higher spatial resolution. The model was trained to predict the radar echo map at a lead time of five minutes from 4 consecutive radar echo maps as input (radar echo maps are generated every 5 minutes). They adopted skip connections between the encoder and the decoder to ensure semantic connectivity, and implemented a recursive approach that uses the five-minute prediction as the next model input for longer-term prediction. Beyond the five-minute lead time, however, the increasing level of smoothing is a shortcoming of RainNet, mainly because the error generated at each prediction step accumulates through the recursive application. Although this method is comparatively more stable than the RNN-based methods, the recursion is repeated only up to a maximum lead time of 60 minutes. The methods mentioned above predict future radar echo maps based on the historical maps. Another solution is to use extracted features as model inputs to predict future precipitation, where the features are associated with precipitation trends. To obtain global features, the Marmot team in the CIKM AnalytiCup 2017 challenge performed template matching and stitching on the original radar image samples to obtain radar echo images of large spatial extent [8]. Subsequently, they extracted features related to future rainfall from the global maps. Finally, a CNN model produced the predicted precipitation from the extracted features.
Since the echo maps within 1.5 hours at 4 heights provided by the organizer had many overlapping areas, the global maps could be obtained through template matching and stitching. Their method therefore required enough maps sharing many identical areas. However, for radar reflectivity data with single or cumulative altitudes, it is almost impossible to obtain global maps through matching and stitching, because the common areas in such radar maps are much smaller than in radar maps of independent altitudes. Hence, it is hard to obtain global features and produce accurate predictions.

Hubei Dataset
The primary data include the radar reflectivity data, the hourly precipitation data measured by the ground monitoring stations, and information about the stations. The radar reflectivity data (radar reflectivity represents the scale and density distribution of precipitation particles inside the meteorological target and is used to represent the target's intensity) are the radar echo composite maps from the Severe Weather Automatic Nowcasting (SWAN) system, covering 109 periods of heavy precipitation from July 2008 to August 2017 in Hubei province. The radar echo maps, which cover an 800 × 1200 km area at a resolution of 800 × 1200 pixels, are generated every 6 minutes; some of them are missing, possibly due to failures of the radar monitoring stations or improper handling and collection.
They are spliced images from multiple radars, with reflectivity accumulated over several heights. Furthermore, the hourly precipitation is recorded by the ground monitoring stations distributed in the Hubei area, and the station information contains the station number, recording time, precipitation (mm/h), and the longitude, latitude, and altitude of each station.
Since machine learning is a data-driven discipline built on massive and diverse samples, it is essential to build a dataset containing various real-world scenes (see Figure 1). First, we filter the hourly precipitation data to retain the positive precipitation records (> 0 mm/h). Second, we select the maps in the 4 hours adjacent to the precipitation time (40 radar images if none are missing). Finally, we clip the surrounding area (301 × 301 km) of the ground station in each image according to the station's location. The dataset is available on request.
Given the problem of missing data, if we simply discarded incomplete samples with fewer than 40 frames, the number of remaining samples would not be sufficient for model training, which requires enough inputs to achieve good performance. Hence, to keep a sufficient number of samples, we retain all samples with at least 37 frames. Since the purpose is to predict precipitation from the maps 1-2 hours before the rainfall, the 10th to 20th maps in a complete sequence are selected, and the same selection is applied to incomplete samples. If a frame within 0-2 hours of the sequence is missing, the first frame in the 2-3 hour range is selected instead, and missing frames in the 2-4 hour range do not affect the selection. In the real world, missing data are often replaced with data from nearby times, so our dataset is in line with reality. Besides, such cases are rare because the number of incomplete samples is smaller than that of complete samples; since machine learning models are robust, the prediction accuracy is not affected too much. Our dataset is therefore realistic and reasonable. In total, we obtained 2502 samples: 1099 samples containing 40 radar echo maps, 413 containing 39, 538 containing 38, and 452 containing 37. Since the intensity of precipitation is not evenly distributed, we computed statistics over different rainfall levels and obtained the ratio of each intensity (see Table 1).
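The frame-selection rule above can be sketched as follows. The exact indexing and the forward-walk fill are our own illustrative reading of the rule (a missing frame in the target window is replaced by the nearest later available frame), since the paper gives no code.

```python
def select_frames(frames, start=10, end=20):
    """Pick the frames covering hours 1-2 before the rainfall record.

    `frames` maps a frame index (0..39, one map per 6 minutes) to an
    image, or to None when that radar map is missing. A missing frame
    inside the target window is replaced by the nearest later available
    frame, mirroring the practice of substituting nearby-time data.
    """
    selected = []
    for i in range(start, end):
        j = i
        while frames.get(j) is None:  # walk forward to the next available frame
            j += 1
        selected.append(frames[j])
    return selected
```

Since a sample keeps at most 3 missing frames, the forward walk always terminates within the 40-frame sequence.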

Purpose and Dataset.
Our purpose is to predict the precipitation at the target station within the next 1-2 hours using the historical radar map sequence of the past 1.5 hours. The target station is located at the center of the radar echo maps, and the spatial extent of the maps is 101 × 101 km. For this purpose, we selected a subset of the Hubei dataset (see Figure 2), in which each sample includes the radar echo maps of the past 1.5 hours, the measured hourly precipitation, and the geographical location of the target station. Among them, 2300 samples are used for training, and the remaining 200 samples are used for evaluation. In preprocessing, we transform the intensity values Z to gray-level pixels P by setting P = 200 × ((Z + 10)/70) + 0.5.
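The preprocessing transform can be written directly as code; clipping the result to the 8-bit gray-level range is our own assumption, not stated in the paper.

```python
import numpy as np

def dbz_to_gray(z):
    """Map radar reflectivity Z (dBZ) to gray-level pixel values with
    P = 200 * ((Z + 10) / 70) + 0.5, as used in preprocessing."""
    p = 200.0 * ((np.asarray(z, dtype=float) + 10.0) / 70.0) + 0.5
    return np.clip(p, 0.0, 255.0)  # clipping range is our assumption
```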

Mechanism Flow.
The precipitation nowcasting task is to predict the future precipitation trend at the location of the target station from the historical radar maps of a certain area around it, which can be viewed as prediction based on a spatiotemporal sequence. Since the evolution of clouds follows certain rules, the key to solving the task is to discover and extract representations that reflect the rainfall trend. Our mechanism has four main steps (see Figure 3), namely, trajectory tracking, feature extraction, model training, and a weighted summary of the models' results. In the first step, we obtain the velocity field by calculating the displacement of the same key points between two maps at adjacent times, and the cloud trajectory can be tracked according to this velocity. In feature extraction, we extract four types of features: the radar reflectivity images, the general description of the cloud, the representation of the cloud's trajectory, and the altitude of the ground station. Then, we train three machine learning models using the different features. The main model is a convolutional neural network (CNN), and the two auxiliary models are a neural network (NN) and a gradient boosting decision tree (GBDT). Finally, the results of the three models are weighted and summed to generate the final predicted precipitation.

Trajectory Tracking.
According to the Taylor frozen-turbulence hypothesis in fluid mechanics [13], there is a significant spatiotemporal correlation in the flow field. We therefore assume that the cloud's motion is persistent in the short term and that the cloud's shape and reflectivity remain nearly unchanged. The signal f at x can then be represented by the signal f′ at x − Uτ, where U is the average convection velocity and τ is a relatively short time interval. To obtain the convection velocity U of each point in the maps, we first match the SIFT descriptors of the key points between two consecutive maps; successfully matched descriptors identify the same points (see Figure 4). Then, the translation δx can be calculated from the coordinates of the matched key points. Since the interval time δt is short, the convection velocity of each point is obtained as U = δx/δt. Finally, we obtain the trajectory using the average convection velocity as the velocity of the clouds (see Figure 5; the yellow triangles in the top images mark the backtracked trajectory).
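Once the key points have been matched (e.g., with OpenCV's SIFT matcher), the velocity estimate and the trajectory backtracking reduce to simple array arithmetic. The sketch below assumes the matching has already produced paired coordinate arrays; the helper names are our own.

```python
import numpy as np

def mean_convection_velocity(pts_prev, pts_next, dt_minutes=6.0):
    """Estimate the average convection velocity U = dx/dt from pairs of
    matched key points on two consecutive radar maps (6 minutes apart)."""
    pts_prev = np.asarray(pts_prev, dtype=float)   # shape (N, 2)
    pts_next = np.asarray(pts_next, dtype=float)   # shape (N, 2)
    displacement = pts_next - pts_prev             # per-point dx in pixels
    return displacement.mean(axis=0) / dt_minutes  # pixels per minute

def backtrack(position, velocity, steps, dt_minutes=6.0):
    """Trace a cloud's past positions, assuming persistent motion."""
    position = np.asarray(position, dtype=float)
    return [position - velocity * dt_minutes * k for k in range(1, steps + 1)]
```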

Feature Extraction.
Since the reflectivity images are radar echo maps, which exhibit strong spatiotemporal correlations and encode the evolution rules of rainfall, it is crucial to extract representations that can reflect the precipitation trend. Here, we extract four types of features: the radar reflectivity images, the general description of the cloud, the representation of the cloud's trajectory, and the altitude of the ground station. For comparison, we select different subsets of these features as model inputs.
Radar reflectivity images: we extract 3 different radar reflectivity image features (see Figure 5). (1) The local reflectivity images on the trajectory: knowing the historical trajectory of each point, we obtain the past coordinates of the clouds that will move to the target station, and then cut out the 41 × 41 area around those coordinates. (2) The central reflectivity images (41 × 41) of the last 3 frames in the sequence. (3) The reflectivity images (101 × 101) of the last 3 frames in the sequence without any cropping.
General description of cloud: different cloud cluster types correspond to different precipitation events; we hence extract some statistics and physical vectors from maps to represent the cloud characteristics as much as possible, including cloud coverage, velocity, acceleration, the histogram of typical SIFT descriptors, average reflectivity intensity, maximum intensity, the standard deviation of intensity, and histogram of intensity.
Representation of the cloud's trajectory: to find representations that reflect the precipitation trend, we calculate the temporal difference in reflectivity intensity. We therefore choose the change of cloud coverage and the statistics of reflectivity on the cloud's trajectory (such as the maximum, minimum, and median values) as the characterization of the cloud's trajectory. The altitude of the ground station: the topography not only influences the formation of precipitation but also affects its distribution; in general, precipitation increases with altitude. We therefore add the altitude of the ground station as one of the nonimage features.
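The trajectory representation can be computed, for instance, as below; the dictionary keys and the exact statistics beyond the max/min/median and coverage change mentioned in the text are our own illustrative choices.

```python
import numpy as np

def trajectory_features(intensity_seq, coverage_seq):
    """Summarize reflectivity along the cloud's trajectory.

    intensity_seq: per-frame reflectivity values sampled on the trajectory.
    coverage_seq:  per-frame cloud-coverage fractions.
    Returns max/min/median intensity plus the change in coverage,
    as rough indicators of the precipitation trend.
    """
    intensity = np.asarray(intensity_seq, dtype=float)
    coverage = np.asarray(coverage_seq, dtype=float)
    return {
        "max_intensity": float(intensity.max()),
        "min_intensity": float(intensity.min()),
        "mid_intensity": float(np.median(intensity)),
        "coverage_change": float(coverage[-1] - coverage[0]),
    }
```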

Models' Training.
For the different types of features, we use 3 models for training and evaluation, namely, the convolutional neural network (CNN), the neural network (NN), and the gradient boosting decision tree (GBDT), with CNN as the main model. To help the models converge better, we scale the label (the measured precipitation) up by a factor of 10. In the end, we obtain the 3 predicted values corresponding to the models.
Convolutional neural network: the input of the CNN model contains image features (the 3 types of radar reflectivity images) and nonimage features (the general description of the cloud, the representation of the cloud's trajectory, and the altitude of the ground station). We train models using different features and compare the results; for details, see Section 4.2.2. Due to the different dimensions of the input features, the structure of the CNN model varies slightly. If the image features are the local reflectivity images on the trajectory or the central reflectivity images, whose dimensions are 41 × 41 × 3, they are fed into 3 convolutional layers, each consisting of a convolution kernel and a max-pooling kernel. The final output of the convolutional layers is flattened and concatenated with the nonimage features, and the prediction is produced after 3 fully connected layers (see Figure 6). If the image feature is the uncropped radar reflectivity image (101 × 101 × 3), we add one more convolutional layer, for 4 convolutional layers in total (see Figure 7). In training, we set the loss function to the mean square error (MSE) and adopt the Adam optimizer with a learning rate of 1.5 × 10⁻³.
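The paper does not report kernel sizes, so the size bookkeeping below assumes 3 × 3 convolutions (no padding, stride 1) and 2 × 2 max pooling; it only illustrates how the input patches shrink through the convolutional layers before flattening.

```python
def conv_out(size, kernel=3, pad=0, stride=1):
    """Spatial size after a convolution (floor division)."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=2):
    """Spatial size after non-overlapping max pooling."""
    return size // kernel

def trace(size, layers):
    """Apply `layers` conv+pool blocks, returning each stage's size."""
    sizes = [size]
    for _ in range(layers):
        sizes.append(pool_out(conv_out(sizes[-1])))
    return sizes
```

Under these assumptions, the 41 × 41 inputs shrink to 3 × 3 feature maps after three blocks, and the 101 × 101 inputs to 4 × 4 after four, which is consistent with giving the uncropped images an extra convolutional layer.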
Neural network: the input of the NN model is the same as the nonimage features fed into the CNN model. After 3 fully connected layers, we obtain the predicted precipitation (see Figure 8). We likewise adopt the MSE loss function and the Adam optimizer (learning rate = 2 × 10⁻³).
Gradient boosting decision tree: GBDT is an ensemble learning algorithm that focuses on the residual between the previous weak learner's output and the correct answer. Each new learner is trained to reduce the remaining residual, so the model is built step by step in the direction of decreasing residuals. The input of this model is the same as that of the NN model. In training, we set the number of weak learners to 200, the maximum depth of each learner (a regression tree) to 1, and the learning rate to 0.03.
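With scikit-learn (an assumption; the paper does not name its GBDT implementation), the stated hyperparameters map directly onto `GradientBoostingRegressor`. The synthetic data below is purely illustrative.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# 200 depth-1 regression trees (decision stumps) with learning rate 0.03,
# matching the hyperparameters stated for the GBDT model.
gbdt = GradientBoostingRegressor(n_estimators=200, max_depth=1, learning_rate=0.03)

# Toy stand-in for the nonimage feature matrix and rainfall labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

gbdt.fit(X, y)
pred = gbdt.predict(X)
```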

Weighted Summary of Models' Results.
To obtain a comprehensive prediction from the models' results mentioned above, we assign weights to and sum the outputs of the 3 models, and take the weighted sum as the final predicted precipitation.
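The weighted summary is a simple convex combination; the weight values below are placeholders, since the paper does not report the ones it used.

```python
def weighted_summary(predictions, weights):
    """Combine the CNN, NN, and GBDT outputs into one prediction.

    `weights` should sum to 1 so the result stays on the same scale
    as the individual models' precipitation estimates.
    """
    assert len(predictions) == len(weights)
    return sum(w * p for w, p in zip(weights, predictions))

# Hypothetical weights favoring the main CNN model.
final = weighted_summary([3.2, 2.8, 3.0], [0.5, 0.2, 0.3])
```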

Benchmarks.
To evaluate our method, we use the models for quantitative precipitation nowcasting from the rainymotion library as benchmarks [14]. The models from rainymotion consist of different combinations of algorithms for the two major steps of the Lagrangian nowcasting framework, namely, tracking and extrapolation [14]. For tracking, the velocity field is computed from a series of consecutive maps through the Dense Inverse Search (DIS) algorithm proposed by Kroeger et al. [15]. In the extrapolation step, the predicted maps are generated by displacing the clouds based on the observed maps and the velocity field. In our experiments, we use the Dense group models from rainymotion, namely, the Dense model and the DenseRotation model. The difference between these two models lies in the extrapolation step: the Dense model uses a constant-vector advection scheme [16], while the DenseRotation model uses a semi-Lagrangian advection scheme [17]. We feed the 15 radar echo maps of the past 1.5 hours into the Dense group models and, setting the number of output images to 20, obtain the predicted sequence for the next 2 hours. Given that our purpose is to predict the hourly precipitation in the coming 1-2 hours, we select the last 10 maps of the predicted sequence and crop out the central area (15 × 15) around the target station. To calculate the predicted precipitation, we apply the Z-R relationship (the relationship between the radar reflectivity Z and the rainfall intensity R, usually defined as Z = aR^b) to the maximum reflectivity of the cropped area. Here, the Z-R relationship is fitted on our dataset, giving a = 353.73 and b = 2.44.
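Inverting the fitted Z-R relationship Z = aR^b gives the rain rate from the maximum reflectivity of the cropped area. Whether Z is taken as linear reflectivity or dBZ follows the fitting convention, which the paper does not spell out; the sketch below simply applies the algebraic inverse.

```python
def rain_rate(z, a=353.73, b=2.44):
    """Invert Z = a * R**b to get the rainfall intensity R (mm/h)
    from the radar reflectivity Z, using the coefficients fitted
    on the Hubei dataset."""
    return (z / a) ** (1.0 / b)
```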

Evaluation and Analysis of Experimental Results.
To evaluate the prediction performance, we use two common metrics, namely, the root mean square error (RMSE) and the mean absolute percentage error (MAPE). The RMSE, defined in equation (1), reflects the expectation of the squared error between the predicted rainfall P and the ground truth G, while the MAPE, defined in equation (2), illustrates the relative error. In our experiments, we adopt the 3 machine learning models (CNN, NN, and GBDT) with different features, including the 3 types of reflectivity image features and the nonimage features mentioned above. Furthermore, we use the optical flow models in rainymotion as benchmarks, fed with the 15 radar echo maps of the past 1.5 hours.
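Both metrics can be implemented directly; adding a small epsilon to the MAPE denominator to guard against division by zero for dry records is our own addition, not part of the paper's definition.

```python
import numpy as np

def rmse(pred, truth):
    """Root mean square error between predicted and observed rainfall."""
    pred, truth = np.asarray(pred, float), np.asarray(truth, float)
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

def mape(pred, truth, eps=1e-8):
    """Mean absolute percentage error (relative error, in percent)."""
    pred, truth = np.asarray(pred, float), np.asarray(truth, float)
    return float(np.mean(np.abs((pred - truth) / (truth + eps))) * 100.0)
```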
From the results (see Table 2 and Figure 9), we find that the altitude factor and the local reflectivity images on the trajectory have a positive impact on prediction performance. Among the 3 machine learning models, the NN model contributes the least. The prediction performance when combining the CNN model and GBDT is better than that of any single model, while among the three single models, the CNN model performs best. However, the prediction ability for severe precipitation is relatively weak, which may be due to the small number of samples with heavy rainfall. Moreover, our mechanism outperforms the Dense group models in rainymotion, which represent the traditional optical flow methods. Although our mechanism has only a slight advantage over the DenseRotation model, more samples would further improve the accuracy of our machine-learning-based mechanism.

Conclusion and Future Work
In this paper, we built a real-world dataset with a few radar maps missing. This dataset is composed of radar reflectivity maps, the precipitation and rainfall times recorded by ground monitoring stations, and the altitude, latitude, and longitude of the stations. We analyzed the possible impact of the missing data on the experiments and concluded that our way of handling them not only keeps most of the samples but also stays in line with reality, without a large negative impact on prediction performance. Furthermore, we proposed a precipitation nowcasting mechanism suitable for radar reflectivity data with single or cumulative altitudes, which showed better performance than the traditional optical flow methods. The general idea of our mechanism is to predict the hourly precipitation in the coming 1-2 hours at the stations using features extracted from the historical radar map sequence and machine learning models. The features comprise 3 types of radar reflectivity images and several nonimage features, including the general description of the cloud, the representation of the cloud's trajectory, and the altitude of the ground monitoring stations. The models are CNN, NN, and GBDT. To obtain a comprehensive prediction from the models' results, we assign weights to and sum the results of the 3 models, and take this weighted sum as the final predicted precipitation.
The experimental results show that our mechanism outperforms the traditional optical flow methods. Because the number of samples in our dataset is small, the accuracy of our mechanism is still limited; we believe that more samples covering various real-world scenes would further improve the performance of our machine-learning-based mechanism. Furthermore, the features extracted from the radar maps are closely connected with future rainfall, and the CNN model works best among the 3 models.
Here, we did not conduct comparison experiments with other neural networks, such as ConvLSTM [3] and RainNet [5], mainly for three reasons. First, the output of these networks is radar echo maps rather than precipitation; to obtain the precipitation, we would have to introduce a Z-R relation. However, the Z-R relation should be adapted to local conditions and different cloud types, so the commonly used fixed Z-R relations cannot give an accurate precipitation result here. Second, the RNN and deep CNN models used in these networks could not converge well with the insufficient samples in our scenario. Third, it was shown in [3, 5] that the prediction sequences of ConvLSTM and RainNet become more and more blurred with increasing lead time until nothing remains in the predicted images. Our aim, however, is to predict the precipitation at the target station within the coming 1-2 hours using the historical radar map sequence of the past 1.5 hours, in other words, using a sequence of 15 past images to predict a sequence of 20 future images, which would lead to inaccurate predictions according to [3, 5].
In our solution, the keys to improving nowcasting accuracy are extracting features that reflect future rainfall trends and choosing appropriate models. Given that the frequencies of different rainfall levels are highly imbalanced (see Table 1), we are considering adopting a weighted loss function. Meanwhile, to evaluate the prediction performance more convincingly, we are considering adding some common meteorological indicators, such as the critical success index (CSI), false alarm rate (FAR), and probability of detection (POD). In short, replacing or combining traditional meteorological methods with machine learning methods, although still at an initial stage, has already achieved some remarkable results. We believe this will become a trend and bring more convenience to people's daily production and life.

Data Availability
No data were used to support the findings of this study.

Conflicts of Interest
The authors declare that they have no conflicts of interest.