Evaluating Correlations and Development of Meteorology Based Yield Forecasting Model for Strawberry

California is among the leading producers of strawberries in the world. The value of the California strawberry crop is approximately $2.6 billion, which makes it one of the most valuable fruit crops for the state and the nation’s economy. California’s weather provides ideal conditions for strawberry production, and changes in weather patterns could have a significant impact on strawberry fruit production. Evaluating relationships between meteorological parameters and strawberry yield can provide valuable information and early indications of yield forecasts that growers can utilize to their advantage. The objectives of this paper were to evaluate correlations between meteorological parameters and strawberry yield for the Santa Maria region and to develop meteorology-based empirical yield forecasting models for strawberries. Results showed significant correlations between meteorological parameters and strawberry yield and provided a basis for yield forecasting with lead time. Results from the empirical models showed that cross-validated yields were closely associated with observed yields with a lead time of 2 to 5 months. Overall, this study showed great potential for developing meteorology-based yield forecasts using principal components. This study only looked at meteorology-based yield forecasts. The skill of these models could be further improved by adding physiological parameters of strawberry to the existing models.


Introduction
California produces 88% of the nation's fresh and frozen strawberries. The value of the California strawberry crop is approximately $2.6 billion, which makes it one of the most valuable fruit crops for the state and the nation's economy. Favorable climate conditions and technological advancements, among other factors, support strawberry yields approximately four times higher than those of other production areas within and outside the United States. According to [1], strawberry acreage has approximately doubled since 1990 and is projected to increase further due to high value and favorable conditions.
Since strawberry is a high-value crop with fruit production spread over several months, proper agronomic practices are important to ensure optimal yields. Additionally, environmental factors can play a very important role during the growth and development of strawberries. Water management is also an important aspect of strawberry production, not only for plant growth and yields but also for leaching salts out of the root zone. Avoiding water stress is also critical for reducing damage from the twospotted spider mite (Tetranychus urticae), a major pest of strawberry.
Apart from water management, one of the major challenges in strawberry production is the impact and control of pests and diseases. The western tarnished plant bug (Lygus hesperus) and the twospotted spider mite are two major pests of strawberry that cause significant yield losses [2]. Spider mites thrive under warmer and drier conditions. Such conditions also promote the migration of the western tarnished plant bug to strawberries and other cultivated hosts from wild hosts in the surrounding areas. Additionally, many pests have shorter life cycles under warmer conditions, and their populations build up rapidly. Diseases such as charcoal rot (Macrophomina phaseolina), Fusarium wilt (Fusarium oxysporum f. sp. fragariae), Phytophthora crown rot (Phytophthora spp.), and Verticillium wilt (Verticillium dahliae) are a challenge in strawberry production, especially in the absence of the fumigant methyl bromide.

Advances in Meteorology
Despite these challenges, California's Mediterranean climate offers ideal weather conditions for both nursery plant and strawberry fruit production. Transplants are produced in high-elevation nurseries in northern California, where cold temperatures allow nursery plants to go through cold hardiness and accumulate carbohydrates in the crowns for optimal growth in the fruit production fields. Fruits are produced in three main regions on the Central Coast, where cool nighttime conditions are conducive to flower production and mild daytime temperatures are ideal for plant and fruit development. Additionally, as the majority of the rainfall occurs during the winter months before the peak fruit production season, it does not typically interfere with fruit production. Variations in weather conditions among the three strawberry production areas in California allow fruit production in each area to complement the others and help avoid market glut. The warmer Oxnard area, the milder Santa Maria area, and the colder Watsonville area, with minimal overlap of their peak fruit production seasons, allow yearlong strawberry production.
Weather influence on strawberry has been documented in various studies. For instance, [3] examined strawberry yield efficiency and its correlation with temperature and solar radiation and found that strawberry yield was significantly correlated with solar radiation. Various other studies have shown the importance of solar radiation in strawberry growth and development overall [4][5][6]. Studies [7,8] examined the impacts of high humidity on strawberry. Study [9] evaluated relationships between weather parameters and various crops in California, including strawberries.
Changes in weather patterns could have a significant impact on strawberry fruit production, timing, and ultimately the market value. Analyzing the influence of weather information on strawberry yield and utilizing it to provide yield forecasts early in the season may provide an opportunity to tailor agricultural practices for higher yields and profits. There are potential benefits of using climate information in decision-making processes in agriculture as a way to adapt to climate variability [10][11][12][13]. Crop growth is weather dependent, and thus it is common practice to predict crop yield based on weather variables [14][15][16][17]. Since strawberry production spreads across 4-5 months, evaluating relationships between meteorological parameters and strawberry yield can provide valuable information and early indications of yield estimates that growers can utilize to their advantage.
The objectives of this paper are to evaluate correlations between meteorological parameters and strawberry yield for the Santa Maria region and to develop meteorology-based empirical yield forecasting models for strawberries.

Strawberry Yield Data.
This paper is focused on strawberry yield data for the Santa Maria region of California. Two-thirds of the total strawberry production acreage is located in the Central Coast and Santa Maria Valley. These regions encompass the coastal regions of Santa Cruz, Santa Clara, Monterey, San Luis Obispo, and northern Santa Barbara counties [18]. According to [19], the two primary fall-planted cultivars grown in the Santa Maria strawberry production district are "San Andreas" and "Monterey," accounting for 39.9% and 32.9% of the district, respectively, for 2016. Additionally, the "San Andreas" cultivar accounted for 22.4%-39.9% of the district and the "Monterey" cultivar accounted for 2.3%-32.9% of the district during the last five years (2012-2016). Strawberry is an annual crop, with plants first grown in nurseries and then transplanted into the fields. For the Santa Maria region, transplanting typically occurs between late July and September. Strawberry production from April to July accounts for most of the yearly strawberry production, with peak production typically during the month of May. Since April-July accounts for most of the yearly values, this paper is focused on yield analysis for this time period.
Daily strawberry yield data for the Santa Maria region were obtained from the California Strawberry Commission's website [19]. This information is publicly available and is originally compiled from the United States Department of Agriculture Market News/Fruits & Vegetables website [20]. Daily strawberry yield data for the months of April through July were aggregated to weekly values. For this analysis we used weekly strawberry yield data for 2009 through 2015. When working with a long record of historical yield data, it is important to examine whether there is a significant upward trend in yield over time, which could be due to technological improvements. Since the number of years used in this study was relatively low, historical strawberry yield data obtained from [19] were utilized directly for correlation analysis and yield forecasting model development.
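The daily-to-weekly aggregation described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say whether weekly values are sums or means, so the sketch sums them; the function name `weekly_totals`, the choice of 7-day bins counted from a season start date, and the sample values are all hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_totals(daily, season_start):
    """Sum daily yields into consecutive 7-day bins counted from season_start.

    Assumption: weekly values are totals; a mean could be used instead.
    """
    totals = defaultdict(float)
    for day, value in daily.items():
        week = (day - season_start).days // 7  # 0-based week index
        totals[week] += value
    return [totals[w] for w in sorted(totals)]


# Hypothetical daily yields (arbitrary units) for 14 days starting April 1.
start = date(2015, 4, 1)
daily = {start + timedelta(days=i): 10.0 + i for i in range(14)}
weekly = weekly_totals(daily, start)
print(weekly)
```

The same binning could be applied to the daily meteorological series so that yield and weather share one weekly time base.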

Meteorological Parameters.
Meteorological data were obtained from the California Irrigation Management Information System (CIMIS; http://www.cimis.water.ca.gov/), a network of over 145 automated weather stations in California. Specific meteorological parameters used in this study were net radiation, air temperature (minimum and maximum), relative humidity (minimum and maximum), dew point temperature, soil temperature (minimum and maximum), vapor pressure (minimum and maximum), reference evapotranspiration, and average wind speed.
Total incoming solar radiation at the CIMIS station is measured using pyranometers and is then used in the calculation of net radiation. Air temperature is measured at a height of 1.5 meters above the ground using a thermistor. Instead of average temperature, minimum and maximum temperatures averaged over a weekly period are used. Daily temperature data obtained from the CIMIS station are aggregated to a weekly time scale for correlation and model development purposes. Soil temperature data are collected at a depth of 15 centimeters below ground using a thermistor with resistance that varies with temperature. Minimum and maximum soil temperatures averaged over a weekly timescale were used for this study. Relative humidity is defined as the amount of water vapor present in air expressed as a percentage of the amount needed for saturation at the same temperature. The relative humidity sensor is sheltered in the same enclosure as the air temperature sensor at 1.5 meters above the ground. Relative humidity is a very important meteorological parameter that can impact fruits such as strawberry, because it is also a good indicator of pests and diseases, to which strawberry yield is highly sensitive. In this study, minimum and maximum relative humidity averaged over a weekly timescale were utilized. Wind speed used in this study was obtained from the CIMIS station, where it is measured using three-cup anemometers at 2.0 meters above the ground. The impacts of wind speed on strawberry yields have been documented in a published study [6]. Wind speed on a weekly time scale was utilized to analyze its impacts on strawberry yield. Vapor pressure of the atmosphere is the partial pressure exerted by atmospheric water vapor. It is a parameter calculated from relative humidity and air temperature data. Reference evapotranspiration is evapotranspiration from standardized grass (ETo). The CIMIS reference evapotranspiration values are calculated using the modified Penman equation. Since ET has a direct influence on crop growth, ETo information was utilized in this study. Weekly meteorological data for this study were obtained from CIMIS for 2007-2015.
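To illustrate how vapor pressure can be derived from relative humidity and air temperature, here is a minimal sketch. It assumes the widely used Tetens-type saturation vapor pressure formula as given in FAO-56; the text does not state which formula CIMIS applies, so the formula, the function names, and the sample inputs are assumptions for illustration only.

```python
import math


def saturation_vapor_pressure_kpa(temp_c):
    """Saturation vapor pressure (kPa) at air temperature temp_c (deg C).

    Assumption: Tetens-type formula in the form used by FAO-56.
    """
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def actual_vapor_pressure_kpa(temp_c, rh_percent):
    """Actual vapor pressure (kPa) from temperature and relative humidity (%)."""
    return saturation_vapor_pressure_kpa(temp_c) * rh_percent / 100.0


# Hypothetical reading: 20 deg C air temperature at 50% relative humidity.
print(round(saturation_vapor_pressure_kpa(20.0), 3))
print(round(actual_vapor_pressure_kpa(20.0, 50.0), 3))
```

Because vapor pressure is computed from relative humidity, the two parameters carry overlapping information.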

Correlation Analysis.
Correlation analysis between meteorological parameters and strawberry yield was performed using the Pearson product-moment correlation. This is a widely used method to measure linear dependence between two variables. In this case, linear dependence was tested between meteorological parameters and strawberry yield.
Weekly values of meteorological parameters from October of the year prior to harvest to February of the harvest year were correlated with weekly strawberry yield from April through July and tested for significance at p < 0.05. Each meteorological variable was correlated with strawberry yields from April to July. This thorough correlation analysis was done in order to understand the influence of meteorological parameters on strawberry yield in more detail. Meteorological parameters that exhibited significant correlation with strawberry yield were then used to develop empirical models to forecast strawberry yields.
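The lagged correlation test described above can be sketched as follows for a single pair of series. This is a hedged illustration, not the authors' code: it computes the Pearson coefficient directly and tests it with the usual t statistic against tabulated two-sided critical values at the 0.05 level. The function names and the seven yearly sample values (a fall relative humidity series and a June yield series) are hypothetical.

```python
import math

# Two-sided critical t values at alpha = 0.05, indexed by degrees of freedom.
T_CRIT_05 = {3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365}


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_significant(r, n):
    """True if r differs from zero at p < 0.05 (two-sided t test, df = n - 2)."""
    df = n - 2
    denom = 1.0 - r * r
    if denom <= 0.0:  # perfect correlation, avoid division by zero
        return True
    t = abs(r) * math.sqrt(df / denom)
    return t > T_CRIT_05[df]


# Hypothetical data: one October humidity value and one June yield per year.
october_rh = [60.0, 62.0, 65.0, 70.0, 72.0, 75.0, 80.0]
june_yield = [5.0, 5.4, 5.3, 6.1, 6.0, 6.6, 6.9]
r = pearson_r(october_rh, june_yield)
print(round(r, 3), is_significant(r, len(october_rh)))
```

Repeating this over every combination of meteorological week and yield week would give a table of significant lagged correlations.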

Principal Component Regression.
Meteorological parameters utilized as independent variables to develop empirical relationships for forecasting strawberry yields exhibit collinearity. Typically, meteorological parameters are significantly correlated with one another. If these explanatory variables were used directly in regression models, the assumption of noncollinearity of explanatory variables would be violated. Use of principal component regression has multiple benefits. It can significantly reduce the number of explanatory variables utilized in the model. This is particularly important and useful when there is high correlation among the explanatory variables, as is the case for meteorological parameters. Another advantage is that the principal components are mutually uncorrelated and thus resolve the issue of multicollinearity in regression models.
Instead of using meteorological parameters as explanatory variables, principal component regression uses principal components derived from these meteorological parameters. The dependent variable for this model was weekly strawberry yield, and the independent variables were principal components of meteorological parameters. The general form of the model is as follows:

$$Y = \beta_1 PC_1 + \cdots + \beta_n PC_n + \varepsilon,$$

where $Y$ is predicted weekly strawberry yield, $PC_1 \cdots PC_n$ are principal components of meteorological parameters, $\beta_1 \cdots \beta_n$ represent estimated parameters for corresponding principal components, and $\varepsilon$ represents residual error.
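A minimal sketch of principal component regression is given below, using NumPy. It is an illustration under stated assumptions rather than the authors' implementation: predictors are standardized before the decomposition, an intercept term is added to the regression (the text lists only coefficients for the components), and the function names and synthetic data are hypothetical.

```python
import numpy as np


def fit_pcr(X, y, n_components):
    """Principal component regression: PCA on standardized X, then OLS on scores."""
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T
    scores = Z @ V  # principal component scores (mutually uncorrelated)
    A = np.column_stack([np.ones(len(y)), scores])  # intercept is an assumption
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return {"mean": mean, "std": std, "V": V, "coef": coef}


def predict_pcr(model, X):
    """Project new predictors onto the stored components and apply the fit."""
    Z = (X - model["mean"]) / model["std"]
    scores = Z @ model["V"]
    return model["coef"][0] + scores @ model["coef"][1:]


# Synthetic, strongly collinear predictors standing in for weather variables.
rng = np.random.default_rng(0)
base = rng.normal(size=(12, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(12, 1)) for _ in range(4)])
y = 3.0 + 2.0 * base[:, 0] + 0.1 * rng.normal(size=12)

model = fit_pcr(X, y, n_components=1)
pred = predict_pcr(model, X)
print(np.corrcoef(pred, y)[0, 1])
```

Keeping only the leading components reduces four collinear predictors to a single regressor here, which is the dimensionality reduction the text describes.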

Cross Validation.
This is a widely utilized statistical method to test a model's validity with an independent dataset. There are various forms of cross validation in which, iteratively, a certain portion of the data is used for training and the rest is used for evaluation. With the leave-one-out cross validation approach, observed data are iteratively and exhaustively used for model testing, resulting in a more reliable evaluation than estimates from the two-group partition method and a less biased one than estimates derived from the calibration-dependent dataset [21]. This approach is particularly useful when only a limited observed dataset is available.
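The leave-one-out procedure can be sketched as follows. To keep the example short, a single-predictor least-squares line stands in for the forecasting model; that simplification, the function names, and the sample values are assumptions for illustration. The key point is that the model is refitted for each held-out observation, so every prediction comes from data the model has not seen.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def loo_predictions(x, y):
    """Predict each observation from a model fitted to all the others."""
    preds = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]  # drop observation i
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        preds.append(slope * x[i] + intercept)
    return preds


# Hypothetical predictor (e.g., a principal component score) and yields.
scores = [-2.1, -1.3, -0.4, 0.2, 0.9, 1.5, 2.0]
yields = [4.8, 5.3, 5.6, 6.1, 6.2, 6.9, 7.0]
print([round(p, 2) for p in loo_predictions(scores, yields)])
```

The resulting cross-validated predictions can then be compared with the observed values to assess forecast skill.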

Correlation Analysis.
Table 1 shows statistically significant correlations of meteorological parameters with strawberry yield for the Santa Maria region. It is evident from this analysis that fall and winter weather conditions have a significant influence on strawberry yields during the peak season, that is, during the months of May through July for the Santa Maria region. This lagged correlation indicates potential for forecasting strawberry yields with a lead time of two to five months with an acceptable level of accuracy. Net radiation during the fall season generally showed positive correlation with late-season strawberry yield. Solar radiation has a direct impact on strawberry growth and development, as it is the source of energy that the strawberry plant utilizes during photosynthesis.
Results show that relative humidity during the month of October is positively correlated with peak strawberry yields, whereas relative humidity during the month of January is negatively correlated with strawberry yields. Vapor pressure, which is calculated from relative humidity, showed a correlation trend with strawberry yield similar to that of relative humidity. It has been documented in the literature that an increase in relative humidity tends to increase fruit weight. It is also associated with increased leaf expansion and increased photosynthesis, which can explain the positive correlations with strawberry yields. However, high humidity could also result in tip burn in strawberry plants [7,8], which could reduce strawberry yield.
Soil temperature during the fall showed positive correlations with strawberry yield. This could be because soil temperature during the early stage of strawberry growth might provide favorable conditions for plant establishment. However, soil temperature during January and February showed negative correlations with strawberry yields. Dew point temperature, that is, the temperature at which dew can start to form, during the fall season showed positive correlation with June and July strawberry yields. If the dew point goes down, there are increasing concerns of frost damage to crops, and thus the higher the dew point, the lower the risk for strawberry plants. Wind speed during December and January was negatively correlated with strawberry yields. Excessive wind speed can bruise the leaves and could impact strawberry yields. These findings are consistent with the literature. For instance, [6] found a 56% increase in strawberry yield with a reduction in mean wind speed from 1.6 m/s to 1.1 m/s.
It is evident that many meteorological parameters during the early stages of the strawberry growth and development phase exhibit statistically significant correlations with strawberry yields from April to July. This finding is consistent with what [9] reported for strawberry and other crops in California. They examined correlations using state average strawberry yield data on a yearly time scale. This study analyzed correlations on a weekly timescale and also developed principal component models to provide weekly strawberry yield forecasts with a lead time of 2 to 5 months.

Yield Forecasting.
Figure 1 and Table 2 show the predictability measures of weekly strawberry yield using meteorological parameter based principal component regression models. Figure 1 shows observed versus predicted yields on a 1 : 1 line, and good agreement between observed and predicted strawberry yields can be observed. The root mean squared error (RMSE) between observed and cross-validated strawberry yield is 747 kg/ha, 627 kg/ha, 518 kg/ha, and 384 kg/ha for April, May, June, and July, respectively. These agreements between observed and predicted strawberry yields are also statistically significant at the 0.05 probability level. The skill of these forecasts is higher for the month of June compared to other months. This is because a higher number of meteorological parameters exhibited significant correlations. However, given that these forecasts are obtained with 2 to 5 months of lead time, these empirical models showed potential for early estimates of expected yields.
It is important to note that there are limits on how much variability in yield data can be explained by meteorological parameters, as many other factors such as management practices, pests, and diseases can also significantly impact yield variability. Additionally, strawberry yield data obtained from the California Strawberry Commission provide an average estimate for the Santa Maria region, which may add some uncertainty to the calculations.
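For reference, the root mean squared error used to compare observed and cross-validated yields can be computed as in the short sketch below. The function name and the sample numbers are hypothetical and are not values from this study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between paired observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical weekly yields (kg/ha) and cross-validated forecasts.
observed = [5200.0, 6100.0, 5800.0, 6400.0]
predicted = [5000.0, 6500.0, 5900.0, 6000.0]
print(round(rmse(observed, predicted), 1))
```

RMSE is expressed in the same units as the yield, which makes it easy to compare against typical weekly yield levels.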
In this study we explored the use of meteorological parameters in developing and testing forecasting models that can provide yield forecasts with a certain lead time, which can enable growers to make strategic decisions. This study only looked at meteorology-based yield forecasts. The skill of these models could be further improved by adding physiological parameters of strawberry to the existing models. Additionally, various other forecasting approaches are documented in the literature. Efforts should be made to compare these approaches in order to enhance forecasting skill as well as increase the lead time of yield forecasts.

Conclusions
This study analyzed correlations on a weekly timescale and also developed principal component models to provide weekly strawberry yield forecasts with a lead time of 2 to 5 months. Several meteorological parameters exhibited significant correlations with strawberry yields. Principal component regression models developed using meteorological parameters provided promising strawberry yield forecasts for the Santa Maria strawberry production region. Agreement between observed strawberry yield and cross-validated yield forecasts was statistically significant for April through July. Future research could evaluate the skill of empirical models that combine both meteorological and agronomic variables.

Figure 1: Observed and cross-validated strawberry yield forecasts for Santa Maria region for April (a), May (b), June (c), and July (d). Axes show monthly strawberry yield (kg ha−1).