Stochastic Modeling of Rainfall in Peninsular Malaysia Using Bartlett Lewis Rectangular Pulses Models

Three versions of the Bartlett Lewis rectangular pulse rainfall model, namely, the Original Bartlett Lewis (OBL), the Modified Bartlett Lewis (MBL), and the 2N-cell-type Bartlett Lewis (BL2n) models, are considered. These models are fitted to hourly rainfall data from 1970 to 2008 obtained from the Petaling Jaya rain gauge station in Peninsular Malaysia. The generalized method of moments is used to estimate the model parameters. Under this method, two objective functions with different weights, one inversely proportional to the variance of each fitting statistic and the other inversely proportional to its squared mean, are minimized using the Nelder-Mead optimization technique. To compare the performance of the three models, the results for July and November are used for illustration. Performance is assessed based on the goodness of fit of the models. In addition, the sensitivity of the parameter estimates to the choice of objective function is investigated. It is found that BL2n slightly outperforms OBL. However, the best model is the MBL, particularly when the objective function uses weights that are inversely proportional to the variance.


Introduction
Stochastic models can be used to generate rainfall across a range of timescales for reservoir design, flood studies, and the design of sewerage systems. Recent developments in hydrological modeling emphasize characterizing rainfall processes at a single site or as an areal average, as described in several works by Rodriguez-Iturbe et al. [1,2], Cowpertwait [3,4], Cowpertwait et al. [5], Onof et al. [6,7], and Agizew and Michel [8]. These authors found that rectangular pulse Poisson process models, particularly the Bartlett Lewis models, successfully describe rainfall processes over a wide range of temporal scales in countries such as the UK, America, and Australia. Rainfall in Peninsular Malaysia, which consists of heavy rainfall of short duration and light rainfall of long duration, can be well explained by some stochastic models. However, little work has been done on applying Bartlett Lewis models to rainfall occurrence in Peninsular Malaysia. In this study, three forms of the Bartlett Lewis rectangular pulse rainfall model are fitted, using the generalized method of moments, to the data available at the Petaling Jaya rain gauge station in Peninsular Malaysia for the period from 1970 to 2008. In addition, the performance of the models is compared under the two objective functions considered within the generalized method of moments approach. For illustrative purposes, the analysis for July and November is provided, since these two months are typical of the two different monsoon periods in Peninsular Malaysia.

In the Original Bartlett Lewis (OBL) model, storm origins arrive following a Poisson process with rate $\lambda$. Within each storm, cells arrive following a Poisson process with rate $\beta$. The generation of rectangular cells within a storm (cluster) stops after a given time. This storm duration is exponentially distributed with parameter $\gamma$. Each cell arrival is associated with a rectangular pulse which has an exponentially distributed duration with parameter $\eta$ and an exponentially distributed depth with mean $\mu_x$. The number of cells per storm follows a geometric distribution with mean $\mu_c = 1 + \beta/\gamma$. Thus, the OBL model is characterized by the set of parameters $(\lambda, \mu_x, \eta, \beta, \gamma)$, as given by Rodriguez-Iturbe et al. [1].
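
The OBL mechanism described above can be sketched in code. The following is a minimal simulation under stated assumptions, not the implementation used in this study: the function names `add_pulse` and `simulate_obl` are hypothetical, time is measured in hours, the first cell of each storm is placed at the storm origin (which is consistent with a mean of 1 + β/γ cells per storm), and the parameter values in the usage lines are illustrative rather than fitted.

```python
import random


def add_pulse(depths, start, end, intensity):
    """Add a rectangular pulse of constant intensity to the hourly bins it overlaps."""
    hour = int(start)
    while hour < len(depths) and hour < end:
        overlap = min(end, hour + 1) - max(start, hour)
        depths[hour] += intensity * overlap
        hour += 1


def simulate_obl(lam, mu_x, eta, beta, gamma, n_hours, rng):
    """Simulate hourly rainfall depths from one realisation of the OBL model."""
    depths = [0.0] * n_hours
    storm_origin = rng.expovariate(lam)  # Poisson storm arrivals, rate lam
    while storm_origin < n_hours:
        storm_end = storm_origin + rng.expovariate(gamma)  # cell-generation window
        cell_origin = storm_origin  # assumption: first cell at the storm origin
        while cell_origin <= storm_end:
            duration = rng.expovariate(eta)          # cell duration, mean 1/eta
            intensity = rng.expovariate(1.0 / mu_x)  # cell intensity, mean mu_x
            add_pulse(depths, cell_origin, cell_origin + duration, intensity)
            cell_origin += rng.expovariate(beta)     # Poisson cell arrivals, rate beta
        storm_origin += rng.expovariate(lam)
    return depths


# Illustrative (not fitted) parameter values; all rates are per hour.
rng = random.Random(1)
rain = simulate_obl(lam=0.02, mu_x=2.0, eta=2.0, beta=0.5, gamma=0.2,
                    n_hours=200_000, rng=rng)
simulated_mean = sum(rain) / len(rain)
theoretical_mean = 0.02 * (1 + 0.5 / 0.2) * 2.0 / 2.0  # lam * mu_c * mu_x / eta
```

Over a long simulation, the sample mean of the hourly depths should be close to λ μ_c μ_x / η, which is the standard first-moment result for the OBL model at the hourly scale (a known result, not stated in the text above).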

The Modified BL (MBL) Model.
The Modified Bartlett Lewis rectangular pulses model (MBL) has been considered in many works because it is applicable to a wide variety of climates. The MBL is illustrated in Figures 1(a) and 1(b), and its assumptions are as follows. Storm origins follow a Poisson process with rate $\lambda$, and cell origins follow a Poisson process with rate $\beta$. Cell arrivals terminate after a period whose length is exponentially distributed with parameter $\gamma$. Each cell has a duration which is exponentially distributed with parameter $\eta$. The cell intensity, which is uniform over the cell duration, is typically assumed to be exponentially distributed with mean $\mu_x$. The parameter $\eta$ varies randomly from storm to storm following a gamma distribution with shape parameter $\alpha$ and scale parameter $\nu$, such that $E(\eta) = \alpha/\nu$ and $\mathrm{var}(\eta) = \alpha/\nu^2$. Consequently, the parameters $\beta$ and $\gamma$ also vary from storm to storm, in such a way that the ratios $\kappa = \beta/\eta$ and $\phi = \gamma/\eta$ remain constant. Therefore, the six-parameter MBL is described by the set of parameters $(\lambda, \mu_x, \alpha, \nu, \kappa, \phi)$. The equations of the BL model, in its original or modified (random parameter) configuration, may be found in the appropriate references, such as Rodriguez-Iturbe et al. [1].
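
The random-parameter step of the MBL can be illustrated with a short sketch. It assumes Python's standard `random` module and a hypothetical helper name `draw_storm_parameters`; the numerical values are illustrative only. Because E(η) = α/ν, the parameter ν acts as an inverse scale in the parameterization of `random.gammavariate`, whose second argument is a scale, so 1/ν is passed.

```python
import random


def draw_storm_parameters(alpha, nu, kappa, phi, rng):
    """Draw (eta, beta, gamma) for a single storm under the MBL assumptions."""
    # E(eta) = alpha / nu, so random.gammavariate needs scale 1 / nu.
    eta = rng.gammavariate(alpha, 1.0 / nu)
    beta = kappa * eta   # keeps kappa = beta / eta fixed across storms
    gamma = phi * eta    # keeps phi = gamma / eta fixed across storms
    return eta, beta, gamma


rng = random.Random(7)
alpha, nu, kappa, phi = 4.0, 2.0, 0.3, 0.1  # illustrative values only
storms = [draw_storm_parameters(alpha, nu, kappa, phi, rng) for _ in range(20_000)]
mean_eta = sum(eta for eta, _, _ in storms) / len(storms)  # should be near alpha / nu
mean_cells = 1.0 + kappa / phi  # beta / gamma = kappa / phi in every storm
```

Since β/γ = κ/φ in every storm, the mean number of cells per storm is 1 + κ/φ regardless of the value of η drawn.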

2N-Cell-Type Bartlett Lewis Model (BL2n).
The original and modified Bartlett Lewis models allow for only one type of rectangular pulse, that is, one type of cell. Therefore, the parameter estimates for the intensity and duration of the cells in these models are likely to be averages over the various types of precipitation that can occur in the same precipitation field. Another possible modification to the Original Bartlett Lewis model (OBL) is therefore to allow more than one type of cell. The simplest possible model uses only one cell type, but the literature suggests that the precipitation field can be broadly classed into two types: convective and stratiform precipitation. Many climates tend to experience both types of precipitation, as shown in Figure 2. BL2n is characterized by two random variables representing the intensities of the two types of cell, denoted by $X_i$, $i = 1, 2$. The random variables $X_i$ follow exponential distributions with means $\mu_{x_i}$, $i = 1, 2$. The duration of a cell of type $i$ is exponentially distributed with parameter $\eta_i$, $i = 1, 2$. The probability of occurrence of cell type $i$ is denoted by $\phi_i$, $i = 1, 2$, such that $\sum_{i=1}^{2} \phi_i = 1$. Thus, the 2N-cell-type Bartlett Lewis model (BL2n) is described by the set of parameters ($\lambda$,

Study Area and Input Data
In this study, hourly rainfall data were obtained from the Petaling Jaya rain gauge station, which is located in the midlands of Peninsular Malaysia. The hourly data for the period from 1970 to 2008 were collected from the database of the Malaysian Meteorological Service (MMS). The city experiences an equatorial climate and is strongly influenced by the monsoons. Rainfall at Petaling Jaya is characterized by stratiform rainfall, which occurs from December to February during the northeast monsoon and from May to August during the southwest monsoon, and by convective rainfall, which occurs from March to April and from September to November, the two intermonsoon periods. In particular, the intermonsoon periods are distinguishable by their higher mean hourly rainfall, as shown in Figure 3. Moreover, the mean hourly rainfall for July is the lowest of all the months of the year, while an increasing trend can be observed from August to December. July is the driest month of the year for the Peninsula. The lag-1 autocorrelation coefficient of the hourly data, ACF(1), exceeds 0.35 for the months of November to February and is lower during the rest of the year. This is not surprising, since many locations in the country, particularly in the northeast, experience heavy rainfall over long durations during this period.
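
The descriptive statistics discussed here (mean, variance, lag-1 autocorrelation, and the proportion of wet intervals at a given timescale) can be computed from an hourly record as in the following sketch. The function names `aggregate` and `summary_statistics` are hypothetical, the sample variance uses an n − 1 divisor by assumption, and the short hourly record is a toy example rather than observed data.

```python
def aggregate(hourly, h):
    """Sum hourly depths into consecutive, non-overlapping h-hour totals."""
    n_blocks = len(hourly) // h
    return [sum(hourly[i * h:(i + 1) * h]) for i in range(n_blocks)]


def summary_statistics(series):
    """Return mean, variance, lag-1 autocorrelation and proportion of wet intervals."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    sum_sq = sum(d * d for d in deviations)
    variance = sum_sq / (n - 1)  # assumption: sample variance with n - 1 divisor
    lag1 = sum(deviations[i] * deviations[i + 1] for i in range(n - 1))
    acf1 = lag1 / sum_sq if sum_sq > 0 else float("nan")
    prop_wet = sum(1 for x in series if x > 0) / n
    return {"mean": mean, "variance": variance, "acf1": acf1, "prop_wet": prop_wet}


# Toy hourly record (mm), not observed data.
hourly = [0.0, 1.2, 3.4, 0.0, 0.0, 0.0, 0.6, 0.0, 0.0, 2.0, 2.5, 0.0]
stats_1h = summary_statistics(hourly)
stats_6h = summary_statistics(aggregate(hourly, 6))
```

Applying `summary_statistics` to each calendar month and each aggregation level separately would give one value of every statistic per year, which is the form needed when weights are derived from across-year variability.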

Model Fitting
Standard estimation techniques such as the maximum likelihood method are difficult to apply to Bartlett Lewis models, since the likelihood is not available in closed form. Parameter estimation for these models is therefore usually carried out using a generalized method of moments (GMM). Specifically, let $\theta = (\theta_1, \theta_2, \ldots, \theta_p)$ be the parameter vector for the model, let $y$ be the observed rainfall data, let $T(y) = (T_1(y), T_2(y), \ldots, T_k(y))$ be a vector of summary statistics computed from the data, and let $\tau(\theta) = (\tau_1(\theta), \tau_2(\theta), \ldots, \tau_k(\theta))$ denote the vector of fitted values of $T$ under the model. The idea behind the method of moments is to choose $\theta$ to minimize the objective function given by

$$S(\theta \mid y) = \left(T(y) - \tau(\theta)\right)^{\top} W \left(T(y) - \tau(\theta)\right), \tag{1}$$

where $W$ is a $k \times k$ positive definite matrix of weights which is determined based on historical data. A special case of (1), obtained when $W$ is diagonal with diagonal elements $w_i$, is given by

$$S(\theta \mid y) = \sum_{i=1}^{k} w_i \left(T_i(y) - \tau_i(\theta)\right)^2. \tag{2}$$

The parameters are estimated using the software by Chandler and Lourmas [9].
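
To make the diagonal form of the objective function concrete, the sketch below evaluates a weighted sum of squared differences between observed and fitted statistics. It is an illustration under stated assumptions and is unrelated to the software of Chandler and Lourmas [9]: the names `obl_mean` and `gmm_objective` are hypothetical, only the mean rainfall at two timescales is used as a fitting property, the "observations" are synthetic, and the fitted mean uses the standard OBL first-moment result λ μ_c μ_x h / η, which is not stated in the text.

```python
def obl_mean(theta, h):
    """Fitted mean h-hour rainfall under the OBL model (standard first-moment result)."""
    lam, mu_x, eta, beta, gamma = theta
    mu_c = 1.0 + beta / gamma  # mean number of cells per storm
    return lam * mu_c * mu_x * h / eta


def gmm_objective(theta, observed, weights, fitted):
    """Weighted sum of squared differences between observed and fitted statistics."""
    return sum(w * (t - f(theta)) ** 2
               for t, w, f in zip(observed, weights, fitted))


# Hypothetical setup: fit only the 1-hour and 24-hour means.
fitted = [lambda th: obl_mean(th, 1), lambda th: obl_mean(th, 24)]
theta_true = (0.02, 2.0, 2.0, 0.5, 0.2)       # illustrative parameter values
observed = [f(theta_true) for f in fitted]    # synthetic "observations"
weights = [1.0 / t ** 2 for t in observed]    # illustrative weights only
at_truth = gmm_objective(theta_true, observed, weights, fitted)
theta_off = (0.03, 2.0, 2.0, 0.5, 0.2)        # storm rate increased by 50%
off_truth = gmm_objective(theta_off, observed, weights, fitted)
```

The objective is zero at the parameters that generated the synthetic statistics and positive elsewhere. Means alone do not identify all five parameters, so a practical fit would also include variances, autocorrelations, and wet probabilities at several timescales, and would pass the objective to a Nelder-Mead routine such as SciPy's `scipy.optimize.minimize` with `method="Nelder-Mead"`.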

Model Performance
The three Bartlett Lewis models considered require the assumption that the historical time series is stationary, as suggested by Cox and Isham [10]. Therefore, following Rodriguez-Iturbe et al. [11], we calibrate the models for each month separately so that the rainfall within each month can be treated as a stationary time series. In addition to comparing the model fits, we have also explored the effect of using different objective functions of the form (2). The two objective functions considered have different weights. The first objective function, OF1, uses the weights $w_i = 1/\mathrm{Var}(T_i(y))$, where $w_i$ is the $i$th diagonal element of the matrix $W$ and $\mathrm{Var}(T_i(y))$ is the variance of the $i$th statistic across all years. The second objective function, OF2, uses the weights $w_i = 1/\bar{T}_i(y)^2$, where $\bar{T}_i(y)$ is the mean of the $i$th statistic across all years. In assessing the different models and objective functions, the criteria considered are the reproduction of the fitting properties and the sensitivity of the parameter estimates to the choice of objective function.
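
The two weighting schemes can be computed directly from the yearly values of each fitting statistic. The sketch below assumes Python's standard `statistics` module, hypothetical function names `weights_of1` and `weights_of2`, a sample variance with an n − 1 divisor, and toy input values rather than the statistics of this study.

```python
from statistics import mean, variance


def weights_of1(yearly_stats):
    """OF1 weights: reciprocal of the across-year variance of each statistic."""
    return [1.0 / variance(values) for values in yearly_stats]


def weights_of2(yearly_stats):
    """OF2 weights: reciprocal of the squared across-year mean of each statistic."""
    return [1.0 / mean(values) ** 2 for values in yearly_stats]


# Toy values: one inner list per fitting statistic, one entry per year.
yearly_stats = [
    [1.0, 2.0, 3.0],   # e.g. a variance-type statistic
    [0.2, 0.4, 0.3],   # e.g. a probability-type statistic
]
w1 = weights_of1(yearly_stats)
w2 = weights_of2(yearly_stats)
```

Under OF1, a statistic that varies little from year to year receives a large weight, whereas OF2 effectively measures each discrepancy relative to the size of the statistic, so the two schemes can rank the same fitting properties quite differently.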

Results and Discussion
The three models, OBL (model 1), MBL (model 2), and BL2n (model 3), have been fitted to the data for each month of the year across all years using the two objective functions considered. To illustrate the results, we provide the analysis for July and November, which represent the dry and wet periods, respectively. The weights for the two objective functions, for the different properties and timescales considered, are given in Table 1 for July and November.

Reproduction of Fitting Properties.
As a check on the minimization, it is worth comparing the objective function values for nested models. Both models 2 and 3 are extensions of model 1. For each objective function, the optimal values, that is, the minimum values attained for each model, are reported in Table 2 for July and November. For July, the optimal value is smallest for model 3 when OF2 is considered, whereas it is smallest for model 2 when OF1 is considered. On the other hand, for November the optimal value is smallest for model 3 under both objective functions. Therefore, the optimal values alone do not clearly identify the best model. In addition, although a lower optimal value is observed for model 3 in November, this is not guaranteed to be the case in general, since model 3 is not an extension of model 2.
To compare the models based on the influence of the objective function, as suggested by Wheater et al. [12], it is useful to calculate objective function thresholds. Unfortunately, due to numerical instabilities, it has not been possible to calculate the thresholds for model 3, which indicates that model 3 may be overparameterised. The 95% thresholds for model 2 based on the July data are 23.2092 and 1.65164 for OF1 and OF2, respectively. These thresholds are similar to those obtained for model 1, and there therefore seems to be little justification for using models with more than the 6 parameters of model 2. For November, the 95% thresholds for model 2 are 25.05906 and 2.730417 for OF1 and OF2, respectively. The objective function values give an overall measure of model fit but do not indicate how well the models reproduce particular properties. Tables 3 and 4 show the observed and fitted properties for July and November, respectively, for all the models using OF1 and OF2.
When the fitted means are compared to the observed means for all the months, the fitted statistics are generally found to be larger. This could be due to the many extreme values in the data set arising from large rainfall amounts. The models tend to underestimate the 6-hour variance and overestimate the 24-hour variance in both months. OF2 is found to be more accurate than OF1 in estimating the variance, since the fitted statistics are closer to the observed values under OF2, particularly for model 2. It is also clear from Tables 3 and 4, however, that there are no remarkable differences in the fitted autocorrelations and wet probabilities across the different timescales. Overall, model 2 is considered the best model, and it is used in the further discussion. The plots of the fitted properties for the original and modified models against the observed properties are shown in Figure 3.

Effect of Objective Function on Parameter Estimates.
The sensitivity of the estimated parameters to the choice of objective functions is presented using the estimates for model 2.
The model fitting at this stage is mainly intended to ensure stability in the numerical optimization of the objective function. The parameter estimates under each objective function are given in Tables 5 and 6 for July and November, respectively. The effect of reweighting the fitting properties can be seen in moving from OF1 to OF2. Overall, the parameter estimates for both July and November are more precise under OF1 than under OF2, because the 95% confidence intervals are shorter.

Conclusion.
Short-duration heavy rainfall and long-duration light rainfall often occur in Peninsular Malaysia, and it seems that the original form of the Bartlett Lewis model (OBL) is unlikely to reproduce the standard statistics adequately. A modified Bartlett Lewis stochastic rainfall model known as the 2N-cell-type Bartlett Lewis model (BL2n) is therefore considered. The BL2n model structure is an extension of the Original Bartlett Lewis model which allows for more than one type of cell. The hourly rain gauge data obtained from Petaling Jaya were used in model fitting for all months, and July and November were considered for the comparison of dry and wet periods. Two objective functions were considered for the model fitting, based on two different sets of weights $w_i$, the diagonal elements of the matrix $W$ in (2): the reciprocal of the variance, $1/\mathrm{Var}(T_i(y))$, and the reciprocal of the squared mean of each statistic. The BL2n model outperforms the OBL, possibly because rainfall in Malaysia is of two types, convective and stratiform. However, the best model is the Modified Bartlett Lewis model (MBL), since it reproduces the fitting properties better than the other models. Overall, the parameter estimates for both July and November are more precise under objective function 1 than under objective function 2. There is also a tendency for parameters to be less well identified in July than in November.

Figure 1: (a) Explanatory sketch for the structures of rainfall storms. (b) Explanatory sketch for the parameters of the Bartlett Lewis rectangular pulses model.

Figure 2: Two types of cells.

Figure 3: Observed and fitted properties for the original and modified models.

Table 1: Fitting properties and weights for the two objective functions for July and November.

Table 2: Optimal values based on the two objective functions for each model fitted to July and November data.

Table 3: Observed rainfall statistics for July with corresponding properties for the fitted models.

Table 4: Observed rainfall statistics for November with corresponding properties for the fitted models.

Table 5: A comparison of results found using OF1 and OF2 on the parameter estimates and 95% confidence intervals for data of July based on model 2.

Table 6: A comparison of results found using OF1 and OF2 on the parameter estimates and 95% confidence intervals for data of November based on model 2.