The Applicability of Bipartite Graph Model for Thunderstorms Forecast over Kolkata

A Single Spectrum Bipartite Graph (SSBG) model is developed to forecast thunderstorms over Kolkata (22°32′N, 88°20′E) during the premonsoon season (April-May). The normal probability distributions of temperature, relative humidity, convective available potential energy (CAPE), and convective inhibition energy (CIN) are examined to quantify the threshold values of these parameters for the prevalence of thunderstorms. The method of conditional probability is then used to ascertain the likelihood of thunderstorm occurrence within the threshold ranges. The SSBG connectivity model developed in this study consists of two sets of vertices: one set contains two time vertices (00 UTC, 12 UTC) and the other contains four meteorological parameters (temperature, relative humidity, CAPE, and CIN). Three distinct ranges of maximal eigenvalues are obtained for the three categories of thunderstorms: (2.6 ± 0.12) for severe, (1.88 ± 0.09) for ordinary, and (1.26 ± 0.03) for no thunderstorm events. The ranges of the threshold values obtained using ten-year data (1997–2006) are taken as the reference range, and the result is validated with India Meteorological Department (IMD) observations, Doppler Weather Radar (DWR) products, and satellite images of 2007. The result reveals that the model provides a 12- to 6-hour forecast (nowcasting) of thunderstorms with 96% to 98% accuracy.


Introduction
A thunderstorm is a mesoscale weather phenomenon with a space scale varying from a few kilometers to a couple of hundred kilometers and a time scale varying from less than an hour to several hours. Severe thunderstorms cause extensive damage to property and crops, as well as human and animal fatalities, through strong surface wind, lightning, large hail, and occasional tornadoes. Every year, during the premonsoon months of April and May, Kolkata (22°32′N, 88°20′E) encounters severe thunderstorms which are locally known as Nor'westers or Kalbaishakhi. Forecasting severe thunderstorms is a challenge for both meteorologists and atmospheric scientists in India because such highly nonlinear and chaotic phenomena may have significant detrimental consequences for agricultural productivity and life [1]. The deterministic chaos inherent in the time series of thunderstorm occurrence has been identified [2]. In the era of state-of-the-art computing techniques, sophisticated and precise computational methods are emerging that can aid the study of complex atmospheric processes ([3–9] and others). There are various conventional methods for the day-to-day forecasting of thunderstorms, such as synoptic weather charts, thermodynamic diagrams (T-Φ gram), radar observations, and statistical and numerical models. The conventional methods of forecasting small-scale weather phenomena have some limitations [10, 11] because a close network of observatories is not available, and thus they might deviate from an accurate forecast.
Thunderstorms are perennial features of India; however, the genesis, structure, evolution process, and dynamics of thunderstorms vary with seasons and locations. Premonsoon (April-May) thunderstorms of Kolkata are the most devastating weather events, leading to major loss of life and property on the surface and aviation hazard aloft. They are associated with towering cumulonimbus (Cb), high frequency of lightning, large hail, occasional tornadoes, and very strong surface wind. The premonsoon thunderstorms have significant socioeconomic impact over this region.
This study aims to explore the applicability of Graph Theory for forecasting premonsoon thunderstorms over Kolkata. The advantage of the approach is that it can accommodate all the complexity, nonlinearity, and inherent chaos of a system in its heuristic framework. The necessity of forecasting thunderstorms [12–15] with considerable lead time (at least 12 hours) and accuracy (at least 90%) led to the selection of some relevant thermodynamic and dynamic parameters as the model input, based on the analysis of ten years of data (1997 to 2006) [15–17]. A plethora of literature is available on the applicability of statistical and numerical methods to forecast thunderstorms [18]. For example, Davies (2004) estimated CIN and Level of Free Convection (LFC) associated with tornado and supercell thunderstorm activity [19]. Huntrieser et al. (1997) compared the traditional and newly derived convective indices with their statistical forecast skills over Switzerland [20]. Michalopoulou (1987) used a statistical method for some convective parameters to forecast thunderstorms over Cyprus [21]. Shafer and Fuelberg (2006) discussed a statistical procedure to forecast warm season lightning [22].

Figure 1: Single spectrum bipartite graph (SSBG) with two sets of vertices, $V_T$ ($V_1$, $V_2$) and $V_P$ ($V_3$, $V_4$, $V_5$, $V_6$).

In the present study, the SSBG model output shows distinct ranges of maximal eigenvalues for severe (2.6 ± 0.12), ordinary (1.88 ± 0.09), and no thunderstorm (1.26 ± 0.03) events. These ranges are used as the reference range for the prediction, and the result is validated with the observations of 2007. The model provides a 12- to 6-hour forecast (nowcasting) of thunderstorms with 96% to 98% accuracy.

Data
The data sources for the study are the India Meteorological Department (http://www.imd.ernet.in) and http://www.weather.uwyo.edu. The lightning data are collected from World Wide Lightning Location Network satellite images (http://webflash.ess.washington.edu/). Hourly (INSAT) KALPANA-1 imagery (IR) during the study period (April-May) is also used. The data are collected for the months of the premonsoon season (April and May) within the period from 1997 to 2007. The location of the study is Kolkata (22°32′N, 88°20′E). The input variables in the present study are the upper air RS/RW sounding data and the record of thunderstorms. Four significant meteorological parameters, namely temperature (T), relative humidity (Rh), convective available potential energy (CAPE), and convective inhibition energy (CIN), are taken as the input to develop the SSBG model. Among the four parameters, the first two are observed parameters and the last two are derived parameters. Severe thunderstorms are identified by a wind speed of 70 km/h (38 knots), high frequency of lightning, and intense cloud mass (evident from satellite imagery), whereas ordinary thunderstorms are identified by a wind speed of 50 km/h (27 knots), low frequency of lightning, and small cloud patches (evident from satellite imagery).

Methodology
The normal probability distribution function [23] is used as the statistical tool to identify the most probable range of values of the selected input parameters (T, Rh, CAPE, CIN) for the occurrence of thunderstorms (Table 1). The method of conditional probability is adopted to ascertain the likelihood of thunderstorm occurrence within the threshold ranges. The SSBG model (Figure 1) is developed to view the pattern of the eigenvalues for thunderstorm and nonthunderstorm days. Identifying the pattern of eigenvalues for the different categories (severe, ordinary, or no thunderstorm) at 12 to 6 hours before the occurrence of the thunderstorms made it possible to develop the forecast model.
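The first step above can be illustrated with a minimal sketch. It assumes that a candidate threshold range is taken as the fitted mean plus or minus one standard deviation; the source does not state the exact rule, and the CAPE values below are hypothetical, not data from the study.

```python
from statistics import NormalDist, mean, stdev

def fit_normal(values):
    """Fit a normal distribution to a sample using its mean and sample std."""
    return NormalDist(mu=mean(values), sigma=stdev(values))

def candidate_range(dist, k=1.0):
    """ASSUMPTION: candidate threshold range = mean +/- k standard deviations."""
    return dist.mean - k * dist.stdev, dist.mean + k * dist.stdev

# HYPOTHETICAL CAPE values (J/kg) on thunderstorm days, for illustration only.
cape = [1200.0, 1850.0, 2100.0, 1600.0, 2400.0, 1950.0, 1700.0]

dist = fit_normal(cape)
low, high = candidate_range(dist)
print(f"mean={dist.mean:.1f}, std={dist.stdev:.1f}, range=({low:.1f}, {high:.1f})")

# The density peaks at the mean, so values inside the range are the most probable.
print(dist.pdf(dist.mean) > dist.pdf(high))
```

The same two helper functions could be applied to T, Rh, and CIN; the multiplier `k` is an illustrative parameter, not one defined in the study.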

Graph Theory-An Overview.
Many real-world situations, besides theoretical mathematics and computer science, can be conveniently described by a graph containing a few points and lines. Basically, a graph consists of a set of vertices ($V_i$) and a set of edges ($E_j$) and can be expressed as

$$G = (V, E), \qquad V = \{V_i\}, \quad E = \{E_j\}. \tag{1}$$

A graph can be represented by its incidence matrix or its adjacency matrix.
The adjacency matrix $A = (a_{ij})_{n \times n}$ of a graph $G$ is defined as [24]

$$a_{ij} = \begin{cases} 1, & \text{if } V_i V_j \in E, \\ 0, & \text{otherwise.} \end{cases} \tag{2}$$

A nonempty graph $G$ is called connected if any two of its vertices are linked by a path in $G$. The connectivity of a graph is an important measure of its robustness as a network.
In a bipartite graph the vertices can be divided into two disjoint sets $U$ and $V$ such that every edge connects one vertex in $U$ and one in $V$; that is, there is no edge between two vertices in the same set. Bipartite graphs are useful for modeling matching problems and are extensively used in modern coding theory, especially to decode the code words received from the channels.
Spectral analysis in graph theory helps to fix bounds on the distribution of eigenvalues. Some eigenvalues have been referred to as the algebraic connection patterns of a graph [25]. Graph eigenvalues have applications in wide areas and in different contexts. However, the fundamental mathematics of spectral graph theory, through all its connections to the pure and the applied, the continuous and the discrete, can be viewed as a single unified subject. There are different lemmas and propositions of spectral graph theory to study the variational characterization of eigenvalues, their bounds, and orientations [26].
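The definitions of adjacency matrix, connectedness, and bipartiteness can be made concrete with a short sketch. The six-vertex graph and the function names below are illustrative assumptions and do not come from the study.

```python
from collections import deque

def adjacency_matrix(n, edges):
    """Return the n x n adjacency matrix of an undirected simple graph."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1
        a[j][i] = 1
    return a

def is_connected(a):
    """Breadth-first search from vertex 0; connected if every vertex is reached."""
    n = len(a)
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if a[u][v] and v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == n

def respects_bipartition(a, left, right):
    """True if no edge joins two vertices that lie in the same set."""
    return not any(a[u][v] for part in (left, right) for u in part for v in part)

# Illustrative graph: vertices 0 and 1 on one side, vertices 2 to 5 on the other.
edges = [(0, 2), (0, 3), (1, 3), (1, 4), (1, 5)]
A = adjacency_matrix(6, edges)
print(is_connected(A), respects_bipartition(A, [0, 1], [2, 3, 4, 5]))
```

Removing the edge (0, 3) would leave the bipartition intact but disconnect the graph, which is the kind of structural change the adjacency matrix records.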

Implementation Procedure.
The endeavour of the present study is to develop a bipartite graph connectivity model (Figure 2) to forecast premonsoon thunderstorms over Kolkata by assigning threshold values to the input parameters. The input parameters considered are temperature (T), relative humidity (Rh), convective available potential energy (CAPE), and convective inhibition energy (CIN). Analysis of ten years of data (1997 to 2006) for the premonsoon months of April and May led to the assignment of threshold values to the meteorological parameters (Table 1). The threshold values of the selected parameters are obtained from the normal probability distribution of the ten-year data set (Table 2). The probability densities are observed to be maximum for the threshold ranges (Table 3).
Conditional probability is used to corroborate the assigned threshold values of the parameters for the occurrence of thunderstorms (Table 3).
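The conditional probability check can be sketched as a relative-frequency estimate. The records, the CAPE range, and the function name below are hypothetical and serve only to show the calculation.

```python
def conditional_probability(records, in_range):
    """Estimate P(thunderstorm | value satisfies in_range) by relative frequency."""
    selected = [r for r in records if in_range(r["value"])]
    if not selected:
        return None  # no day satisfies the condition, so the estimate is undefined
    return sum(1 for r in selected if r["thunderstorm"]) / len(selected)

# HYPOTHETICAL daily CAPE values (J/kg) with thunderstorm reports.
records = [
    {"value": 1800.0, "thunderstorm": True},
    {"value": 2200.0, "thunderstorm": True},
    {"value": 1500.0, "thunderstorm": False},
    {"value": 600.0, "thunderstorm": False},
    {"value": 400.0, "thunderstorm": False},
    {"value": 2500.0, "thunderstorm": True},
    {"value": 700.0, "thunderstorm": True},
]

LOW, HIGH = 1000.0, 3000.0  # HYPOTHETICAL threshold range, not the study's values

p_inside = conditional_probability(records, lambda v: LOW <= v <= HIGH)
p_outside = conditional_probability(records, lambda v: not (LOW <= v <= HIGH))
print(p_inside, p_outside)
```

A threshold range is supported when the probability inside the range is clearly higher than the probability outside it.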
The analysis is restricted to the computation of the maximal eigenvalues, their actual ranges, and the variation of the different categories (severe, ordinary, and no thunderstorm events) from the mean. Figure 4 shows the patterns of the eigenvalues corresponding to thunderstorm days. The spectra of the graph model show variable eigenvalues. This is apparent because, as the positions of the vertices change, the entries of the adjacency matrix also change, which is reflected in the eigenvalues. Thus, the spectra of the bipartite graphs may lead to an inconclusive result, and the forecast may deviate from its actual aim. The bipartite graph model is therefore restricted to a single spectrum bipartite graph (SSBG) (Figure 1) in the present study. The schematic of the forecast model using SSBG is shown in Figure 2.
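A minimal sketch of the graph construction follows, assuming unweighted 0/1 edges between the two time vertices and the four parameter vertices. The threshold ranges and sounding values are hypothetical (the values of Table 1 are not reproduced here), and the source does not specify how edges are weighted, so the eigenvalues produced by this sketch need not match the ranges reported in the study. It uses the standard fact that the largest eigenvalue of a bipartite adjacency matrix [[0, B], [Bᵀ, 0]] equals the square root of the largest eigenvalue of B Bᵀ.

```python
import math

TIMES = ["00UTC", "12UTC"]
PARAMS = ["T", "Rh", "CAPE", "CIN"]

# HYPOTHETICAL threshold ranges; the study's Table 1 values are not available here.
THRESHOLDS = {
    "T": (30.0, 36.0),
    "Rh": (70.0, 90.0),
    "CAPE": (1000.0, 3000.0),
    "CIN": (0.0, 150.0),
}

def biadjacency(soundings):
    """2 x 4 matrix B: entry is 1 if the parameter at that time lies in its range."""
    return [
        [1 if THRESHOLDS[p][0] <= soundings[t][p] <= THRESHOLDS[p][1] else 0
         for p in PARAMS]
        for t in TIMES
    ]

def maximal_eigenvalue(b):
    """Largest eigenvalue of [[0, B], [B^T, 0]] for a two-row B.

    Computed as sqrt of the largest eigenvalue of the 2 x 2 matrix B B^T.
    """
    r0, r1 = b
    m00 = sum(x * x for x in r0)
    m11 = sum(x * x for x in r1)
    m01 = sum(x * y for x, y in zip(r0, r1))
    trace = m00 + m11
    det = m00 * m11 - m01 * m01
    disc = math.sqrt(max(trace * trace - 4 * det, 0.0))
    return math.sqrt((trace + disc) / 2)

# HYPOTHETICAL soundings for one day.
day = {
    "00UTC": {"T": 32.0, "Rh": 80.0, "CAPE": 1500.0, "CIN": 100.0},
    "12UTC": {"T": 34.0, "Rh": 65.0, "CAPE": 2200.0, "CIN": 50.0},
}
B = biadjacency(day)
print(B, round(maximal_eigenvalue(B), 4))
```

Working with the 2 x 4 biadjacency matrix avoids a general eigenvalue solver; with a numerical library, the same value could be obtained from the full 6 x 6 adjacency matrix.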

Results and Discussion
The analysis is done with ten years of premonsoon data from 1997 to 2006. The data and records of 112 thunderstorms are collected for the station Kolkata (22°32′N, 88°20′E), and thus 112 bipartite graphs are constructed. The bipartite graphs in the model consist of two sets of vertices (Figure 1), one with time ($V_T$) and the other with the four parameters ($V_P$). The vertex list for $V_T$ is {$v_1$, $v_2$}, where $v_1$ ⇒ 00 GMT and $v_2$ ⇒ 12 GMT, whereas for $V_P$ the vertex list is {$v_3$, $v_4$, $v_5$, $v_6$}, where $v_3$ ⇒ T, $v_4$ ⇒ Rh, $v_5$ ⇒ CAPE, and $v_6$ ⇒ CIN. There is a path between the two sets of vertices, $V_T$ and $V_P$, for a particular thunderstorm day if the values in the vertex list of $V_P$ match the threshold values of the parameters. A set of connected bipartite graphs is thus constructed for thunderstorm days. The thunderstorm days are plotted on two sets of bipartite graphs and their adjacency matrices are formed. The eigenvalues of the bipartite graphs are computed. Only the highest positive eigenvalue from each bipartite graph is taken as the measure of connectedness of the graph. The statistical method of conditional probability is used to establish that the threshold ranges of the parameters are the required ranges for the prevalence of thunderstorms (Table 3). The result shows that the threshold values have higher probabilities than the other values (Figure 3). The conditional probability thus supports the conclusion that the selected threshold values of the parameters are optimum for the occurrence of thunderstorms. Two sets of eigenvalues, one corresponding to the bipartite graphs satisfying the threshold values of the parameters and the other corresponding to values outside the threshold ranges, are computed and plotted (Figure 4). The eigenvalues computed from the bipartite graph model with the threshold values of the parameters as inputs are then analyzed and classified into two categories of thunderstorms, severe and ordinary, as per the record of IMD. The same analysis is done for nonthunderstorm days. Three distinctly different ranges of maximal eigenvalues are observed for severe, ordinary, and no thunderstorm events (Figure 5). These ranges are taken as the input values of the model and the result is validated with the observations of 2007. It is observed that thunderstorms and their severity can be predicted with the model at 12 to 6 hours before the occurrence, using the selected ranges of the parameters, with a high degree of accuracy. The prediction error is observed to vary within 2% to 4% for the 12- to 6-hour forecast, while the error varies within 16% to 46% for the 24-hour forecast (Figure 6). The prediction errors are estimated using the eigenvalue ranges (2.6 ± 0.12), (1.88 ± 0.09), and (1.26 ± 0.03) for severe, ordinary, and no thunderstorm events, respectively. However, if, for a particular day, the maximal eigenvalue deviates from the given ranges, then the prediction error (%) is computed using the maximum deviation from the mean eigenvalues.
The prediction error (PE) is computed from this deviation of the maximal eigenvalue. The result thus reveals that the different ranges of the maximal eigenvalue of the bipartite graph are related to the different categories of thunderstorms. The different categories (severe, ordinary, and no thunderstorm) show significant differences in the corresponding ranges of eigenvalues. The statistical approach alone does not provide any information regarding the severity of a thunderstorm, whereas the bipartite graph connectivity approach can be a useful tool to measure the strength of thunderstorms. The model output is thus validated with the observations of 2007. The results reveal that the 12- to 6-hour forecast with the maximal eigenvalues within the selected range (Table 4) has better accuracy than the 24-hour forecast (Figures 6 and 7). The forecast skills, Probability of Detection (POD) and False Alarm Ratio (FAR), are computed with the SSBG model output using the total data set (training data and test data) from 1997 to 2007 for premonsoon days (April-May) as input to the model (Figure 8). The result shows that the POD for severe and ordinary thunderstorms is 97% and 91%, respectively, while the FAR is 3% and 7%, respectively. However, for nonthunderstorm events the FAR computed with the SSBG model output is observed to be 9% and the POD is 86%. Thus, according to the forecast skill analysis, out of 100 no thunderstorm events, there will be 9 false alarms, and 91 times the model does not give any false positive alarm (Figure 9). The prediction error (PE) calculated for the total data set is 3.2%. Thus, the SSBG model provides a 96.8% accurate forecast with the 11-year data set (1997–2007).
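For reference, the two skill scores can be computed from contingency-table counts using their standard definitions: POD = hits / (hits + misses) and FAR = false alarms / (hits + false alarms). The counts in this sketch are hypothetical and are not the contingency table of the study.

```python
def probability_of_detection(hits, misses):
    """POD: fraction of observed events that were correctly forecast."""
    return hits / (hits + misses)

def false_alarm_ratio(hits, false_alarms):
    """FAR: fraction of forecast events that did not occur."""
    return false_alarms / (hits + false_alarms)

# HYPOTHETICAL counts for one thunderstorm category.
hits, misses, false_alarms = 97, 3, 3

pod = probability_of_detection(hits, misses)
far = false_alarm_ratio(hits, false_alarms)
print(f"POD = {pod:.0%}, FAR = {far:.0%}")
```

Note that POD and FAR use different denominators (observed events versus forecast events), so they do not generally sum to one.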
Advances in Meteorology
The day-to-day variation of the input parameters (CAPE, CIN, temperature, and relative humidity) and the reported thunderstorms are shown using contour plots with the days on the X-axis and the corresponding eigenvalues on the Y-axis (Figure 10). Severe thunderstorms are marked with green dots, ordinary thunderstorms with red dots, and no thunderstorm events with white dots. The eigenvalues are computed for the month of April 2000. Two severe thunderstorms (15th and 24th April) and four ordinary thunderstorms (12th, 21st, 23rd, and 29th April) that occurred in April 2000 over Kolkata (22°32′N, 88°20′E) are considered. The eigenvalues corresponding to the different categories of thunderstorms are observed to follow the assigned pattern (Figure 5). Satellite-based lightning data are also taken into account for the analysis of the severity of thunderstorms, along with the surface wind speed and radar observations. The lightning frequency accompanying a severe thunderstorm (May 28, 2006) and an ordinary thunderstorm (May 20, 2006) over Kolkata is depicted in Figures 11 and 12. Lightning frequency is higher on the severe thunderstorm day during the movement of the squall line over the station (Figure 11), while very few lightning events occurred on the ordinary thunderstorm day.

Applicability and Limitations
The study shows that the threshold values of the meteorological parameters are optimum for the prevalence of severe thunderstorms over Kolkata (22°32′N, 88°20′E). The SSBG model is capable of providing nowcasts for ordinary and severe thunderstorms. However, the limitation of the model is that it has to be a single spectrum model. If the positions of the vertices are changed, then the model output changes. This is expected because a change in the vertex positions leads to a change in the adjacency matrix input, and consequently the eigenvalues, rank, and so forth will be different. Using the single spectrum bipartite graph model, three well-separated maximal eigenvalue ranges are obtained for the different categories: severe, ordinary, and no thunderstorm. The model is developed particularly to forecast the thunderstorms of the premonsoon season over Kolkata. The SSBG forecast model is trained with ten years of data (1997–2006) for a particular period (April-May) to select the threshold values of the input parameters. The model can be used to forecast the thunderstorms of any other season or place provided that the threshold values of the input parameters are selected properly for the region.

Conclusion
The present study using the SSBG model indicates that the threshold values of the selected meteorological parameters are suitable for forecasting thunderstorms over Kolkata. The maximal eigenvalues of the bipartite graphs corresponding to severe, ordinary, and no thunderstorm days follow distinctly different patterns. If, for a particular day, the maximal eigenvalue computed using the selected parameters matches one of these patterns, then the occurrence and severity of the thunderstorm can be predicted with 96% to 98% accuracy and a 12- to 6-hour lead time.
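The decision rule described above can be sketched as a simple range lookup using the reported reference ranges (2.6 ± 0.12, 1.88 ± 0.09, 1.26 ± 0.03). The function name and the "unclassified" label for values outside every range are illustrative assumptions; the source does not define a category for such values.

```python
# Reference ranges reported in the study: (mean, tolerance) per category.
REFERENCE_RANGES = {
    "severe": (2.60, 0.12),
    "ordinary": (1.88, 0.09),
    "no thunderstorm": (1.26, 0.03),
}

def classify(max_eigenvalue):
    """Return the category whose reference range contains the eigenvalue.

    ASSUMPTION: values outside every range are labelled 'unclassified'.
    """
    for category, (center, tolerance) in REFERENCE_RANGES.items():
        if abs(max_eigenvalue - center) <= tolerance:
            return category
    return "unclassified"

for value in (2.6, 1.9, 1.25, 2.2):
    print(value, classify(value))
```

Because the three ranges do not overlap, the order in which they are checked does not affect the result.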

Figure 3: Conditional probabilities with the threshold values and the values other than threshold ranges for thunderstorm days.

Figure 6: Diagram showing the prediction errors for the 12- to 6-hour and 24-hour forecasts using the SSBG model for the 2007 validation.

Figure 7: Validation diagram showing the variation of observed and predicted eigenvalues with the single spectrum bipartite graph model for the thunderstorms of 2007.

Figure 10: The day-to-day variation of input parameters (clockwise: CAPE, CIN, Temperature, and Relative Humidity) along with thunderstorm cases like severe (marked green), ordinary (marked red), and no thunderstorm and calculated eigenvalues for the month of April, 2000.

Figure 11: Satellite images showing the higher number of lightning events (marked region) associated with the advancement of the severe squall line on May 28, 2006 (severe thunderstorm day).

Figure 12: Satellite images showing the lower number of lightning events (marked region) associated with the advancement of the ordinary squall on May 20, 2006 (ordinary thunderstorm day).

Figure 15: Forecasting results of the SSBG model (10 hours in advance) using (a) atmospheric soundings and (b), (c), (d) Doppler images of the sequential advancement of a squall line on May 5, 2007.

Figure 16: Forecasting results of the SSBG model (12 hours in advance) using (a) atmospheric sounding and (b), (c), (d) Doppler images of the sequential advancement of an ordinary thunderstorm on May 22, 2007.
The SSBG forecast model is validated with actual observations of the 2007 thunderstorm events using satellite images (INSAT, KALPANA-1 IR), Doppler weather radar images, and India Meteorological Department (IMD) observations. A severe thunderstorm (May 21, 2007) with a surface wind speed of 81 km/h and an ordinary thunderstorm (May 22, 2007) with a surface wind speed of 50 km/h are predicted with lead times of 10 hours and 12 hours, respectively, with the SSBG forecast model. The forecasts of thunderstorms (April 9, May 3, May 12, and May 28, 2007) with the model are successfully validated with India Meteorological Department (IMD) observations (Table 4). Figure 13 shows the sequential INSAT IR images (KALPANA-1) on May 21, 2007 (severe thunderstorm day). Intense convection is observed at 11 to 13 UTC over Kolkata and the surrounding areas. The cloud top temperature is observed to be −70°C. The satellite images show isolated cloud patches at 12 UTC on May 22, 2007 (ordinary thunderstorm day), which intensify at 15 UTC over Kolkata and adjoining areas (Figure 14). The cloud top temperature is observed to be −58°C. Doppler radar images also support the SSBG model forecast (Figures 15 and 16).

Table 1: Threshold values of the parameters.

Table 2: Mean and standard deviation of normal probability distribution for the selected ranges of parameters before the occurrence of thunderstorms over Kolkata.

Table 4: Validation of the single spectrum bipartite graph model with the observation of 2007.