Predicting Winter Wheat Grain Yield Using Fractional Green Canopy Cover (FGCC)

Optical sensors have grown in popularity for estimating plant health, and they form the basis of midseason yield estimations and nitrogen (N) fertilizer recommendations, such as the Oklahoma State University (OSU) nitrogen fertilization optimization algorithm (NFOA). That algorithm uses measurements of normalized difference vegetation index (NDVI), yet not all producers have access to the sensors required to make these measurements. In contrast, most producers have access to smartphones, which can measure fractional green canopy cover (FGCC) using the Canopeo app, but the usefulness of these measurements for midseason yield estimations remains untested. Our objectives were to (1) quantify the relationship between NDVI and FGCC, (2) assess the potential for using FGCC values in place of NDVI values in the current OSU Yield Prediction Model, and (3) compare the performance of NDVI and FGCC-based yield prediction models from the collected dataset. This project, implemented on 13 winter wheat sites over the 2019-2020 growing season, used a range of N rates (0, 34, 67, 101, and 134 kg N ha⁻¹) to provide different levels of yield. Our results indicated that while NDVI and FGCC are highly correlated (r = 0.76), FGCC is not suitable for direct insertion into the current yield prediction model. However, a yield prediction model derived from FGCC provided similar estimates of yield compared to NDVI (Nash-Sutcliffe Efficiency = −3.3). This new FGCC-based model will give more producers access to sensor-based yield prediction and N rate recommendations.


Introduction
Sensor-based yield prediction technology can be an important decision support tool for producers across the United States, with the resulting near-real-time yield estimates allowing farmers to optimize fertilizer application rates. Yield predictions, and associated fertilizer recommendations, can be made using normalized difference vegetation index (NDVI) measurements collected using instruments mounted to farm equipment or handheld sensors. While such sensors provide valuable information, the costs and availability of NDVI sensors can deter producers from adopting them [1]. On the other hand, fertilizer recommendations based on fractional green canopy cover (FGCC) rather than NDVI may be a more cost-effective option, because the Canopeo smartphone application [2] enables FGCC measurements and requires only a smartphone camera.
Increasing eutrophication in coastal waters and increasing nitrate levels in drinking water motivate steps to reduce nutrient losses from agriculture. One avenue explored is using optical sensors to assess plant health and to guide nutrient recommendations [3]. Lukina et al. [4] noted that NDVI measurements could be useful for in-season yield estimates, which could in turn be used to develop N rate recommendations through the nitrogen fertilization optimization algorithm (NFOA). Since the inception of the NFOA, NDVI has become one of the most often used measures of plant health [5]. NDVI is based on reflectance of near infrared light (NIR) and red light, ranging from 0 to 1, with higher values coming from healthier (i.e., greener) plants [6]:
\[ \mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} \qquad (1) \]
NDVI can be measured in a variety of ways, such as handheld sensors (Greenseeker, Trimble Agriculture; Crop Circle, Holland Scientific), sensors mounted to farm equipment, drones with cameras (Sentera, DJI, Micasense), and satellites [7]. However, NDVI is not the only option for measuring plant health. Canopeo, a free smartphone application developed by Oklahoma State University, is a tool that measures fractional green canopy cover (FGCC) using downward facing digital images [2]. Rather than quantifying the level of greenness, as is the case with NDVI, Canopeo is designed to measure canopy cover as the presence or absence of green vegetation above the soil surface. In this way, FGCC offers an assessment of crop health in terms of fractional canopy cover, with values ranging from 0 (no cover) to 1 (full cover). Although they are different measurements, NDVI (greenness) and FGCC (canopy cover) are inherently linked [8,9].
While FGCC has been used to estimate yields of forage crops [10] and above ground biomass of row crops [11], few studies have quantified relationships between FGCC values and winter wheat grain yields. Goodwin et al. [12] predicted winter wheat grain yields using both NDVI and FGCC values, and they found relatively poor yield relationships for NDVI (r² = 0.28 to 0.49) and FGCC (r² = 0.14 to 0.45) using direct comparisons between sensor readings and yield. In that study, NDVI and FGCC were positively correlated with correlation coefficients of 0.87 at Feekes 5 and 0.73 at Feekes 6.
However, current yield prediction models used by Oklahoma State University do not directly correlate NDVI and grain yield. Instead, the in-season yield estimation tool (INSEY; Raun et al. [13]) estimates yield based on the ratio of NDVI and the number of growing days (denoted as GDD > 0) since planting (i.e., NDVI per day of growth):
\[ \mathrm{INSEY} = \frac{\mathrm{NDVI}}{\text{number of days with GDD} > 0} \qquad (2) \]
Here, GDD > 0 refers to the number of days with average air temperature above 4.4°C. This value is a threshold set for small grain crops (including wheat) above which growth occurs [13]. The INSEY model is based on over 30 site-years of data, and the yield estimates using INSEY are stronger (r² = 0.54, [13]) than estimates based directly on NDVI (r² = 0.28 to 0.49, [12]).
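The INSEY calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the temperature series are hypothetical, and only the 4.4°C threshold and the "sensor value per day of growth" definition come from the text.

```python
def days_with_growth(daily_mean_temps_c, threshold_c=4.4):
    """Count days whose average air temperature exceeds the growth threshold."""
    return sum(1 for t in daily_mean_temps_c if t > threshold_c)


def insey(sensor_value, daily_mean_temps_c, threshold_c=4.4):
    """In-season estimate of yield: sensor value per day of growth."""
    days = days_with_growth(daily_mean_temps_c, threshold_c)
    if days == 0:
        raise ValueError("no days above the growth threshold")
    return sensor_value / days


# Hypothetical record: 120 days since planting, 95 of them warm enough for growth.
temps = [10.0] * 95 + [1.0] * 25
example_insey = insey(0.60, temps)  # NDVI of 0.60 is an illustrative value
print(example_insey)
```

The same function accepts an FGCC reading in place of NDVI, which is how the sensor types are compared later in the paper.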
While the correlation of NDVI and FGCC suggests that FGCC may have value for calculating INSEY, this has yet to be determined. Therefore, our objectives were to (1) quantify the relationship between NDVI and FGCC, (2) assess the potential for using FGCC values in place of NDVI values in the current OSU Yield Prediction Model, and (3) compare the performance of NDVI and FGCC-based yield prediction models from the collected dataset. We collected sensor values from one year across 13 locations from winter wheat N rate trials across the state of Oklahoma at recommended sensor timing dates for optimum yield prediction [14,15]. This work was done to lay a framework by which FGCC can become a viable tool for grain producers to predict yield and make subsequent N management decisions.

Study Area.
This trial was conducted over 13 sites (6 at research stations, 7 on-farm locations) during the 2019-2020 growing season that spanned the wheat producing regions of Oklahoma (Figure 1, Table 1). The climate of Oklahoma is diverse, ranging from humid subtropical in the southeast to semiarid in the northwest and panhandle region [16]. These sites have annual rainfall totals ranging from 478 to 932 mm and mean annual temperatures ranging from 13.7 to 16.4°C [17].

Vegetation Image Data Collection and Analysis.
Vegetative sensing measurements of each plot were collected within 80-110 accumulated growing degree days of wheat growth (GDD > 0), which ranged from February 27 to March 29, 2020. This time frame was chosen as it is where yield prediction has been found to be the most accurate for NDVI [14,15]. The NDVI measurements were collected using a GreenSeeker (Trimble Agriculture, Westminster, CO, USA) approximately 0.6 m above the crop canopy surface. Digital images capturing an area of approximately 1.2 × 1.5 m were collected using a Samsung Galaxy S9 smartphone (Samsung Group, Seoul, South Korea). Nadir images were collected by holding the phone out at a 90° angle directly in front of the researcher at arm's length, approximately 1.4 m off the ground, for each plot. Each image was then analyzed using the Canopeo tool in MATLAB 2020b [18].
This tool estimates canopy coverage by classifying each pixel from an RGB image based on its color values. NDVI values in this experiment ranged from 0.24 to 0.77, and FGCC values ranged from 0.04 to 0.80, which spans nearly the range of possible ground cover values (0 to 1).
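The pixel-classification idea can be illustrated with a short Python sketch. It is not the Canopeo code used in the study. The specific rule and thresholds below (red-to-green and blue-to-green ratios under 0.95, and an excess-green value 2G − R − B above 20) are assumptions taken from commonly cited Canopeo defaults, and the function names and sample pixels are hypothetical.

```python
def is_green(r, g, b, rg_max=0.95, bg_max=0.95, excess_green_min=20):
    """Classify one RGB pixel (0-255 channels) as green canopy or not.

    The three thresholds are assumed defaults, not values stated in this paper.
    """
    if g == 0:
        return False  # a pixel with no green signal cannot be canopy
    return (r / g < rg_max) and (b / g < bg_max) and (2 * g - r - b > excess_green_min)


def fractional_green_canopy_cover(pixels):
    """Fraction of pixels classified as green (0 = no cover, 1 = full cover)."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("image has no pixels")
    return sum(is_green(r, g, b) for r, g, b in pixels) / len(pixels)


# Hypothetical four-pixel "image": two leaf pixels, one soil, one residue.
image = [(40, 120, 30), (60, 150, 50), (120, 90, 70), (200, 190, 170)]
fgcc = fractional_green_canopy_cover(image)
print(fgcc)
```

Because each pixel is either counted or not, the result measures how much of the frame is covered by green vegetation rather than how green the vegetation is, which is the distinction the paper draws between FGCC and NDVI.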

Grain Yield Sampling.
At physiological maturity, whole plant samples were collected from a 0.9 m × 0.9 m area in each plot using sickles. Samples were placed in a forced-air oven at 43°C for at least 24 hours, threshed using a small plot thresher to separate the wheat berry from the chaff, and then weighed.

Yield Prediction Model.
The yield prediction model portion of the nitrogen fertilizer optimization algorithm, developed by Raun et al. [13], is derived from six equations. The first is for INSEY, or in-season estimate of yield (see (2)). The INSEY provides a value of growth as a rate, as NDVI value per GDD > 0, signifying the increase in NDVI per day of growth under current growing conditions. INSEY is to be taken from the farmer practice strip (FPS), or the area to which N will be applied to reach yield potential. This value is then input into the yield prediction model, which returns YP0.
International Journal of Agronomy
The YP0 reflects the yield potential of the crop assuming no factors are changed (i.e., no other nutrients added, no drought stresses). This model also includes a standard deviation shift in the positive direction, to reflect yield potential. It is important to note that grain yield limiting factors occurring after sensing can cause differences between predicted and measured grain yields at harvest. Yield prediction provides a snapshot in time of that crop and does not take into account any postsensing stressors.
To predict yield, assuming an application of N occurs after sensing, at least two NDVI readings are required. These come from an area where a high rate of N is applied (N-Rich Strip) and another area outside the N-Rich Strip, where N is to be applied based on the yield prediction model from the FPS. The ratio of the two readings (N-Rich Strip : FPS), the response index (RI), reflects the relationship expected of both sensor values and yield. However, over time, data collection has shown the need to adjust that ratio [19] to reflect the difference between RI_Sensor and RI_Yield. RI_NDVI is calculated as
\[ \mathrm{RI}_{\mathrm{NDVI}} = \frac{\mathrm{NDVI}_{\text{N-Rich}}}{\mathrm{NDVI}_{\mathrm{FPS}}} \qquad (4) \]
and RI_Adjust is a linear function of RI_NDVI (the current adjustment is shown in Figure 3). Yield (YPN) is then predicted using
\[ \mathrm{YP}_{N} = \mathrm{YP}_{0} \times \mathrm{RI}_{\mathrm{Adjust}} \qquad (5) \]
where YPN is the yield prediction after N application. Fertilizer recommendations can then be calculated from the difference between YPN and YP0. While the N rate is a product of the NFOA, evaluating the accuracy of N rates derived from the NFOA is beyond the scope of this manuscript.

Statistical Analysis.
Statistical modeling was conducted using trend analysis software in Microsoft Excel. Linear regression modeling was used to explore relationships of FGCC and NDVI, RI_Sensor and RI_Yield, and predicted and measured yield. Exponential regression was used to build the yield prediction models from both FGCC and NDVI. Nash-Sutcliffe Efficiency (NSE) was used to assess yield prediction models compared to the achieved yield of the study [20]. The NSE has been used to compare field observed data to predicted values in other studies evaluating hydrologic [21,22] and forage yield models [23], but we use this value to assess grain yield prediction models. The NSE values range from −∞ to 1, with a value of 1 indicating a perfectly fitted model, a value of 0 indicating that the model performs only as well as a model that uses the mean of the observed values as its prediction, and a negative value indicating that the model performs worse than the observed mean.
Figures were produced using package ggplot2 in R [24,25].
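The interpretation of NSE given above (1 for a perfect fit, 0 for a model no better than the observed mean, negative for a model worse than the mean) follows directly from its standard definition, NSE = 1 − Σ(observed − predicted)² / Σ(observed − mean observed)². A minimal Python sketch, with a hypothetical function name and illustrative yields that are not data from this study:

```python
def nash_sutcliffe_efficiency(observed, predicted):
    """NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have no variance")
    return 1 - ss_res / ss_tot


# Illustrative yields in kg/ha (hypothetical).
observed = [2000.0, 3000.0, 4000.0]
nse_perfect = nash_sutcliffe_efficiency(observed, [2000.0, 3000.0, 4000.0])
nse_mean = nash_sutcliffe_efficiency(observed, [3000.0, 3000.0, 3000.0])
nse_inflated = nash_sutcliffe_efficiency(observed, [6000.0, 7000.0, 8000.0])
print(nse_perfect, nse_mean, nse_inflated)
```

The third case shows why predictions that track the trend in observed yield but are systematically too high can still return a strongly negative NSE even when r² is moderate.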

Relating Canopeo to NDVI.
Wheat producers often rely on optical sensors to gauge the N needs of their crop, but sensors that measure NDVI require special equipment that could be less accessible to many producers. On the other hand, smartphone-based technologies such as Canopeo that measure fractional green canopy cover (FGCC) are accessible to nearly all producers, but the effectiveness of using this technology for yield prediction is largely unknown. While NDVI and FGCC are grounded on inherently different technologies, we found that these measurements are strongly correlated (r² = 0.76, Figure 2).

RI Comparison and Adjustment.
A primary component of the NFOA is the response index (RI) of both yield and sensor values and the subsequent adjustment. The RI_Sensor is the ratio of the sensor value from the N-Rich Strip and the sensor value from the farmer practice strip (FPS). The RI_Yield refers to the ratio of the yield from the N-Rich Strip and the yield from the FPS. Due to the discrepancy between RI_Yield and RI_Sensor, an adjustment is necessary to accurately predict yield from sensor values. This adjustment, RI_Adj, is described by the linear models portrayed in Figure 3 for both sensor types, as well as the current RI_Adj used. The linear regression models derived from the FGCC and NDVI data had coefficients of determination of 0.76 and 0.27, respectively. The slope of the FGCC-derived model was less than 1 (0.55). The opposite was true for the NDVI-derived model, which had a slope of 2.49.
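The response index and its linear adjustment can be sketched as follows. This is an illustration under stated assumptions rather than the study's procedure: the function names are hypothetical, the paired RI values are made-up numbers chosen to lie on a straight line, and the fitted coefficients are not those of Figure 3 (the paper reports the slopes but not the intercepts).

```python
def response_index(n_rich_value, fps_value):
    """Ratio of the N-Rich Strip value to the farmer practice strip (FPS) value."""
    if fps_value <= 0:
        raise ValueError("FPS value must be positive")
    return n_rich_value / fps_value


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical site-level pairs of sensor RI and yield RI (not study data).
ri_sensor = [1.0, 1.5, 2.0, 3.0]
ri_yield = [1.0, 1.25, 1.5, 2.0]
slope, intercept = fit_line(ri_sensor, ri_yield)

# Apply the fitted adjustment to a new pair of readings, then scale YP0.
ri_new = response_index(0.60, 0.40)       # hypothetical N-Rich and FPS readings
ri_adjusted = slope * ri_new + intercept  # sensor RI mapped onto the yield RI scale
yp_0 = 2000.0                             # hypothetical YP0, kg/ha
yp_n = yp_0 * ri_adjusted                 # YPN = YP0 x adjusted RI
print(slope, intercept, ri_adjusted, yp_n)
```

A fitted slope below 1, as in this toy example and as reported for FGCC, means the sensor RI overstates the yield response and must be scaled down before it multiplies YP0.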

Yield Predictions with NDVI and FGCC.
For both sensor types, yield was predicted using the current NFOA and displayed in Figures 4(a)-4(d). Figure 4 displays the predicted yields from the current yield prediction model using FGCC data (a), FGCC data without the RI adjustment (b), FGCC data with the new RI adjustment found with the RI_FGCC comparison (c), and the NDVI from the same dataset (d). It is important to keep in mind that predicted yield is expected to be higher than the actual yield (represented by points falling below the 1 : 1 line), as yield prediction represents a snapshot in time and assumes no yield-limiting factors occur after sensing. That is, yield predictions represent the upper limit of potential yields. The strength of the predicted-achieved yield relationship was low (r² = 0.34, NSE = −35.5) when yield prediction was made by inputting FGCC into the current NFOA (Figure 4(a)). This caused increases in yield prediction that were not only not reflected by achieved yield, but also in some cases not possible to reach in the Oklahoma environment (e.g., predicted: 21578 kg ha⁻¹, achieved: 2737 kg ha⁻¹). Removing the RI_Adjust led to increased accuracy (Figure 4(b), r² = 0.33, NSE = −11.4), but predictions were still inaccurate. Adjusting the RI_Sensor using the RI adjustment for FGCC from Figure 3 gave the most accurate predictions within the current NFOA (Figure 4(c), r² = 0.26, NSE = −3.3), compared with NDVI from the same dataset (Figure 4(d), r² = 0.12, NSE = −4.3).
A yield prediction model was built from FGCC data using the same methods used to build the original NFOA model (Figure 5). For this dataset, the FGCC-built yield prediction model provided a similar correlation (r² = 0.47) to the NDVI model (r² = 0.53).

Discussion
While FGCC derived from Canopeo is useful for estimating biomass yield [10,11], very few studies have used FGCC to develop yield prediction for grain crops, rather than just measuring biomass. Goodwin et al. [12] investigated using both NDVI and FGCC to estimate grain yield in winter wheat in Ohio at different growth stages and found that these values could account for the most variability in yield, as long as sensor readings were taken at or prior to Feekes 5 growth stage. While our trial used different methods of developing yield prediction models than Goodwin et al. [12], our results support their findings.
Our results showed a significant relationship between FGCC and NDVI. As these sensors measure two distinctly different variables (NDVI: greenness; FGCC: canopy cover), a 1 : 1 relationship is not expected. The deviation from the 1 : 1 line indicates that FGCC should not be directly inserted in the NFOA in place of NDVI, and doing so could skew yield predictions. Yet the relationship is strong, supporting the possibility that FGCC could be utilized similarly to NDVI. The deviation between FGCC and NDVI values becomes apparent when investigating the RI_Adj portion of the NFOA. In Figure 3, we can see that the slopes of the NDVI and FGCC RI lines differ markedly. The NDVI RI adjustment line shows that the RI_NDVI must be increased to reach the RI_Yield, whereas RI_FGCC would need to be reduced to reach RI_Yield (Figure 3). The RI_NDVI values from this trial produce an adjustment that is closer to what OSU currently utilizes in its NFOA than RI_FGCC does, as expected, because the OSU RI_Adj is derived from NDVI values. The RI_NDVI has outliers that veer from the regression line. These points come from locations in which there was marginal difference in sensor values at sensing, yet which provided a very high response to the addition of N. The RI_FGCC had a much stronger coefficient of determination (r² = 0.76) with RI_Yield, which suggests that FGCC was more sensitive to differences between plots receiving N and those not.
Figure 4: Predicted yields using sensor readings plotted against actual yield. These points were developed from the FGCC readings (a), FGCC yield prediction without the RI adjustment (b), FGCC with the new RI adjustment (c), and the NDVI from the same dataset (d). As expected, RI_Sensor caused inaccuracies with yield prediction. Removing the RI adjustment pulled those values closer to 1 : 1 but was still not enough to be considered accurate. Adjusting RI_Sensor with the new RI adjustment caused those values to become similar in accuracy to the NDVI-derived yield prediction, with a couple of outliers coming from locations with very high RI_Sensor values.
While the high r² value supports the possibility that FGCC could be utilized in yield prediction models, due to the differences between FGCC and NDVI, directly using FGCC in the current NFOA would not produce accurate prediction values. As the YPN (see (5)) is calculated by multiplying YP0 by RI_Adjust, the drastic differences between the RI_NDVI and RI_FGCC would impact the yield predictions and most likely overestimate the yield response if using FGCC. This can be seen in the yields predicted directly from FGCC, which were highly inaccurate. Using FGCC and the new RI_Adjust in the current NFOA provided the most accurate yield prediction compared to the achieved yield (Figure 4(c), NSE = −3.3, r² = 0.26). Using NDVI provided less accurate yield predictions (Figure 4(d), NSE = −4.3, r² = 0.12). Yet, there were some outliers present in the data. These can occur when there are great differences between the sensor readings coming from the N-Rich Strip and the FPS. Each of these locations had low NO3-N levels in the soil test analysis, which allows for greater response to preplant N.
Yet, there are still opportunities for yield predictions to provide higher accuracy. One is to develop the yield prediction model from the dataset itself (Figure 5), using the same methods used to build the original NFOA [13]. This was done by plotting INSEY (sensor value divided by GDD > 0, i.e., growth rate) against actual yield. Figure 5 displays a model very similar in form to the original NFOA models. The FGCC model has a good coefficient of determination (r² = 0.47), but not as high as the model derived from the NDVI data (r² = 0.53).
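Building a yield prediction model from INSEY and measured yield amounts to fitting an exponential curve, yield = a · exp(b · INSEY), as stated in the statistical methods. One common way to do this, assumed here rather than confirmed by the paper, is ordinary least squares on the natural log of yield. The function names, the coefficients a = 500 and b = 300, and the data points are all hypothetical.

```python
import math


def fit_exponential(insey_values, yields):
    """Fit yield = a * exp(b * INSEY) by least squares on ln(yield)."""
    if len(insey_values) != len(yields) or len(yields) < 2:
        raise ValueError("need at least two paired observations")
    if any(y <= 0 for y in yields):
        raise ValueError("yields must be positive to take logarithms")
    log_y = [math.log(y) for y in yields]
    n = len(yields)
    mean_x = sum(insey_values) / n
    mean_ly = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in insey_values)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(insey_values, log_y))
    b = sxy / sxx
    a = math.exp(mean_ly - b * mean_x)
    return a, b


def predict_yield(a, b, insey_value):
    """Predicted yield for one INSEY value under the fitted model."""
    return a * math.exp(b * insey_value)


# Hypothetical noise-free data generated from a = 500, b = 300.
insey_obs = [0.002, 0.004, 0.006, 0.008]
yield_obs = [500.0 * math.exp(300.0 * x) for x in insey_obs]
a_fit, b_fit = fit_exponential(insey_obs, yield_obs)
print(a_fit, b_fit, predict_yield(a_fit, b_fit, 0.005))
```

Since the sensor value only enters through INSEY, the same fit can be run once with NDVI-based INSEY and once with FGCC-based INSEY to compare the two models on one dataset.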
While the coefficients of determination of the two models do not indicate a very accurate model, this is to be expected, considering the wide range of locales and environments in which this study was executed. Yield prediction models offer a snapshot in time of the highest level of yield attainable with current conditions. Any extraneous circumstances that occur after sensing can decrease the yield ceiling. Dhillon et al. [14] reported high coefficient of determination values when investigating optimum sensing timings for yield prediction in Oklahoma, but those values were based on a subset of data that spanned 4 site-years, including the same trials that were used to develop the currently used NFOA. The dataset from our study spans 13 sites, 300 km, and 478-932 mm rainfall over one growing season [17]. Drought stress, freeze damage, weed pressure, and other circumstances could have decreased yield potential after sensing, but due to the number of locations and their relative distance to each other, researchers were not capable of recording each event. This must be taken into account when creating an NFOA to serve large or variable areas. Previous work has shown that NDVI can be used to estimate winter wheat yield but requires regional data and equations to provide the most accurate estimations [5,26]. Future work may consider creating multiple NFOAs to better serve the area, which could increase region-specific accuracy.
It is important to note the magnitude of response reported in this study, where RI_Yield reached 10.1, or a 10-fold increase in yield from the 0 N check to fully fertilized plots. Many nitrogen response studies have been conducted in Oklahoma over the past two decades and report an average "high" response of 2.0 [27-30]. While the exceptionally high achieved yield response could be an artifact of varying locale or environment in one growing season, it is certainly uncommon and creates challenges when comparing the results to existing literature.
All FGCC readings were conducted using the same camera, by the same researcher, at the same height and orientation for every plot. When using the current NFOA with NDVI, producers use handheld sensors that naturally, when held, are level with the ground, and the producer adjusts the sensor height to the height of the crop. Implement-mounted sensors are also mounted in a level orientation and are also adjustable to the height of the crop. If collecting FGCC using a smartphone, further work would need to quantify errors associated with handheld use, as well as across multiple setups (smartphone, implement-mounted camera, drone imagery, video implementation, etc.). While this model is derived from only one growing season, it is compiled from 13 sites and multiple blocks per location. Across the dataset, the yields ranged from 324 to 5884 kg ha⁻¹, across many different soil types, wheat varieties, and environmental pressures. This shows that FGCC values have the potential to be just as effective as NDVI in predicting yield. While this dataset is limited in time, it is not limited by space across the state. We acknowledge that while this model is not robust enough to fit growing seasons drastically different from the 2019-2020 growing season, it does provide an avenue for future refinement.

Conclusion
Sensor-based nutrient recommendations have become popular in the past couple of decades, but instruments capable of capturing normalized difference vegetation index (NDVI) can be costly. Canopeo, a tool developed in MATLAB and available for free as an app for most smartphones, provides estimates of fractional green canopy cover (FGCC). We found that FGCC was correlated with NDVI, suggesting that it could provide an alternative to NDVI for in-season yield estimates in winter wheat. When FGCC was used in the Oklahoma nitrogen fertilization optimization algorithm (NFOA) in place of NDVI [4], with the RI adjustment derived for FGCC, we found that yield predictions based on FGCC (NSE = −3.3, r² = 0.26) were at least as accurate as those based on NDVI (NSE = −4.3, r² = 0.12). A model developed from FGCC using the same methods used to develop the first NFOA was also found to be similar to a model built using the NDVI from this project. This model sets the framework for utilizing FGCC to build N rate recommendations for the future, not just for Oklahoma, but for other areas as well. With cost being a significant barrier to adoption of current yield prediction methods (NDVI sensors), an NFOA model built to utilize FGCC allows more producers in the state to have affordable access to precision N management technologies and use them in their production practices.

Data Availability
The data are available in the dissertation of Dr. Vaughn Reed and can be found at https://shareok.org/discover.

Conflicts of Interest
The authors know of no potential conflicts of interest.