Classification and Prediction of Bee Honey Indirect Adulteration Using Physiochemical Properties Coupled with K-Means Clustering and Simulated Annealing-Artificial Neural Networks (SA-ANNs)

Faculty of Engineering, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan
Faculty of Agriculture, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan
Philadelphia University, Faculty of Engineering, P.O. Box 19392, Amman, Jordan
Faculty of Agricultural Sciences and Technology, Palestine Technical University, Kadoorie (PTUK), P.O. Box 7, Jaffa Street, Tulkarm, State of Palestine


Introduction
Honey is a natural sweet substance produced by honey bees from secretions and nectars of plants. Honey bees collect and transform these materials, combine them with specific substances of their own, and then deposit and store the product in honeycombs to ripen and mature [1]. Honey is diverse in composition, appearance, and sensory perception; it is composed of sugars, mainly fructose and glucose, in addition to 25 other oligosaccharides. It also contains small amounts of proteins, enzymes, amino acids, minerals, trace elements, vitamins, and polyphenols [2].
Honey is rich in nutraceuticals such as antioxidants, enzymes, flavonoids, and phenolic compounds. It has important medicinal properties, including antibacterial, anticancer, hepatoprotective, hypoglycemic, antihypertensive, and antioxidant activity [3]. The conversion from nectar to honey is a slow process that begins after the returning flight. In the colony, the water content is reduced to 16-20%, and worker bees add the enzymes invertase and glucose oxidase to the nectar. Invertase converts sucrose into the two six-carbon sugars glucose and fructose, while glucose oxidase converts a small amount of glucose into hydrogen peroxide and gluconic acid. These enzymes give honey its typical sugar composition [4].
Adulteration of honey involves the addition of inexpensive sweeteners such as high fructose corn syrups (HFCSs), sucrose syrups, high fructose inulin syrups, or invert syrups. Standard detection methods such as direct sugar analysis by HPLC or GC-MS may not readily detect adulteration, because adulterant sugars can be formulated to closely resemble the sugar composition of pure honey and therefore have physical properties similar to those of the major natural honey components [5]. Adulteration is done either directly or indirectly. Direct adulteration involves the addition of various commercial sugar syrups to pure honey [6]. Several studies reported the use of sugar in honey production and its effect on sugar profile, phytochemicals, mineral content, and viscosity. Ribeiro et al. [7] reported that direct addition of high fructose corn syrup to honey affected its chemical and physical properties such as color, pH, water activity, moisture content, and ash content. Yilmaz et al. [8] reported that honey adulteration by sucrose and fructose syrups at various concentrations affected the rheological, physical, and chemical properties. White [9] reported the use of various carbohydrate constituents in honey to detect honey adulteration.
Oroian et al. [10] studied honey adulteration with fructose, glucose, and hydrolyzed inulin syrup and reported that it influenced some physicochemical properties such as pH, electrical conductivity, and water activity. Guler et al. [11] investigated changes in viscosity for adulterated honey and reported that viscosity increased with sugar syrup concentration. Several methods have been used to evaluate direct adulteration in honey. Kelly et al. [12] reported the use of near-infrared transflectance spectroscopy to detect adulteration of Irish honey by high fructose corn syrup and beet invert syrup. Gallardo-Velázquez et al. [13] investigated the use of mid-infrared Fourier transform spectroscopy to quantify the content of honey adulterants including HFCS, corn syrup, and inverted sugar. Ruiz-Matute et al. [6] reported the use of GC-MS for detection of honey adulteration with high fructose inulin syrups. Liquid chromatography (LC) and gas chromatography (GC) have been used together to detect exogenous sugars in honey through characteristic fingerprints of adulteration [5]. Kumaravelu and Gopal [14] reported the use of near-infrared spectroscopy and partial least squares regression for detection and quantification of jaggery adulteration in four honey types. Siddiqui et al. [15] provided a comprehensive review of honey adulteration techniques for the period between 2000 and 2016. They reported that NMR spectroscopy was a powerful methodology for honey authentication and for detecting adulteration by various sugars. Indirect adulteration of honey involves feeding honey bees with different sugar solutions at stages when natural nectars are not available, or in order to develop colonies with optimal population in time for nectar flows, build up colonies after exposure to pesticides, and increase colony populations during autumn and spring division [11,16].
Unlike direct adulteration, indirect adulteration of honey, which involves feeding honey bees with commercial sugars, is extremely difficult to detect. Few studies on the detection of indirect honey adulteration have been reported. Cavrar et al. [17] found that random feeding of sucrose syrup changed the moisture content and sugar profile and reduced the phenol and antioxidant contents of honey. Cordella et al. [18] investigated the use of high-performance anion exchange chromatography with pulsed amperometric detection (HPAEC-PAD) combined with chemometric techniques for detection of indirect honey adulteration. Honey samples from French beekeepers were adulterated with between 10% and 40% of different industrial sugar syrups used for feeding honey bees. They found that linear discriminant and canonical analysis were useful for classifying adulterated honey, with 96.5% accuracy. Guler et al. [19] investigated the use of carbon isotope ratios. They analyzed 100 samples of unadulterated honey and honey produced by bees fed with sugar syrups at 5, 20, and 100 litres/colony. The syrups included sucrose syrups (SS), glucose syrups (GMS), HFC-85, HFC-55, and bee-feeding syrups (BFS). They were able to detect adulteration in honey from colonies fed with 20 and 100 litres/colony of HFC-85 and 100 litres/colony of HFC-55, but not in honey from colonies fed with syrups at 5 litres/colony. They reported that internal standards for the detection of carbon isotope ratios and the official methods [20-22] were not effective in detecting adulteration of honey obtained by feeding bees syrups made from C3 plants such as wheat (Triticum vulgare) and sugar beet (Beta vulgaris). Bertelli et al. [23] reported an effective detection method for honey adulterated using sugar syrups, based on one-dimensional (1D) and two-dimensional (2D) nuclear magnetic resonance (NMR) and multivariate statistical analyses. The study used 63 honey samples from various botanical sources and seven different sugar syrups. They analyzed 63 samples of honey from colonies fed with the seven sugar syrups and 63 unadulterated honey samples. The best classification model involved 1D spectra and gave a prediction capability of 95.2% in cross-verification. The 2D-NMR analyses gave less satisfactory results, with a cross-verification predictability of 90.5%.
The problem needs further investigation by evaluating the effect of feeding bees at different sugar concentrations on the physiochemical properties of the resulting honey. It is also necessary to develop a new reliable and cost-effective method for detecting indirect adulteration in honey.
Therefore, the objective of this study is to use the k-means clustering algorithm and ANNs to classify and predict the levels of indirect honey adulteration based on physiochemical parameters including sugar profile, color, pH, and acidity.

Sample Collection.
Colonies with two-aged queen bees and honey bee subspecies were used in this study. Adult bee frames were covered with brood (frames occupied with eggs). Foundation comb (made of beeswax with a raised pattern of cell outlines), drugs, transport, and control procedures were standardized. Leak-proof containers with an adequate surface area were used to cover the colonies and to supply sugar syrup outside the hive. Rocks and pieces of wood were placed in the containers so that bees could stand on them while imbibing the syrup. The syrup was prepared in the proportion of 1 kg of granulated sucrose in 100 L of water. Syrup was prepared using hot water without boiling, with regular stirring to remove air bubbles and dissolve sugar crystals. The mixture was clear with a pale straw color. The sugar syrup was stored in suitable clean plastic drums. For bee feeding, a jar was placed on a special feeding frame at the entrance of the colony. Those containers are often referred to as Boardman feeders. They were refilled daily when they became empty. No veterinary drugs were used for any honey bee disease. Honey was harvested and centrifuged, filtered with a sieve, and then collected in glass jars. Honey samples were taken from 7 colonies located on a farm in Ajloun city, Northern Jordan. Honey samples were collected from colonies with different feeding concentrations placed in the same area but at different distances from each other to ensure that they were fed with the same type of normal feeding (nectar). Two types of honey were collected: pure honey, where the colony was not sugar fed and was allowed to feed completely on natural flowering, and sugar-fed honey, where colonies were fed sucrose syrup (1 : 1 ratio of sucrose/water) in the following amounts: 10, 20, 40, 60, 80, and 100 L once every 3 days.

Sugar Profile Analysis Using HPLC.
Analysis of honey sugars was conducted according to the AOAC method [24] with minor modifications. A 10 μL portion of each prepared sample was injected into an HPLC equipped with a refractive index detector (Shimadzu RID-10A). A separation column (Shimpack SCR-101N, 250 mm L × 4.6 mm I.D., 10 μm) was used. The column temperature was held at 30°C. The mobile phase was a mixture of water/acetonitrile (80 : 20 v/v). The flow rate was 1.3 mL/min. Sugars were identified according to their retention times by comparison with appropriate sugar standards. Quantitation was performed according to the external standard method on peak areas or peak heights.

Moisture Determination.
Moisture content was determined using the indirect refractometric method. All measurements were taken using an Abbe refractometer, and the percentage of moisture was obtained from the refractive index of the honey sample by reference to the Wedmore conversion table [25]. Moisture content has been reported to contribute to the stability of honey against fermentation and granulation during storage [26].

Acidity and pH.
The pH and free acidity were determined according to the harmonized methods of the International Honey Commission [25]. A solution was prepared by dissolving 10 g of honey in 75 mL of CO2-free distilled water, and its pH was measured using a pH meter (CyberScan pH510, Eutech Instruments). The free acidity was measured by titrating the same solution (10 g honey dissolved in 75 mL of CO2-free distilled water) with 0.1 M NaOH to pH 8.3; the results were expressed in milliequivalents per kilogram.
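The titration arithmetic can be illustrated with a short sketch. It assumes only the stoichiometry implied by the method described above (a weighed honey sample titrated with NaOH of known molarity); the function name and the example titrant volume are hypothetical and are not values measured in this study.

```python
def free_acidity_meq_per_kg(naoh_volume_ml, naoh_molarity=0.1, honey_mass_g=10.0):
    """Free acidity (meq/kg) from the NaOH volume needed to reach pH 8.3."""
    # mL x mol/L gives mmol of NaOH; NaOH is monobasic, so mmol equals meq.
    milliequivalents = naoh_volume_ml * naoh_molarity
    # Scale from the weighed honey mass (g) to one kilogram.
    return milliequivalents * 1000.0 / honey_mass_g


# Hypothetical titration: 0.70 mL of 0.1 M NaOH for 10 g of honey.
example_acidity = free_acidity_meq_per_kg(0.70)
print(round(example_acidity, 2))
```

For a 10 g sample and 0.1 M NaOH, the expression reduces to ten times the titrant volume in millilitres.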

Color Measurement.
Honey color was measured by colorimeter (12 MM Aperture U 59730 Inc., Pittsford, New York, USA) and recorded using the L*, a*, and b* color system according to [27]. The colorimeter was calibrated with a standard white ceramic reference (Commission Internationale de l'Éclairage; L* = 97.91, a* = −0.68, and b* = +2.45). In addition, total color difference (ΔE) and chroma were calculated from the measured L*, a*, and b* values. Three replicates were obtained for all measurements (except for HPLC, with 2 replicates).
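As an illustration of the derived color quantities, the sketch below uses the common CIE definitions: chroma as the length of the (a*, b*) vector and ΔE as the Euclidean distance between two (L*, a*, b*) triples. The reference point used for ΔE is not stated in the text, so taking the white calibration tile as the default reference is an assumption, and the sample values are hypothetical.

```python
import math

# White calibration tile values quoted in the text (L*, a*, b*).
# Using them as the default Delta-E reference is an assumption.
WHITE_REFERENCE = (97.91, -0.68, 2.45)


def chroma(a_star, b_star):
    """Chroma C* = sqrt(a*^2 + b*^2)."""
    return math.hypot(a_star, b_star)


def total_color_difference(lab, reference=WHITE_REFERENCE):
    """Euclidean distance between two (L*, a*, b*) triples."""
    return math.sqrt(sum((x - r) ** 2 for x, r in zip(lab, reference)))


# Hypothetical honey reading, not a value from this study.
sample = (60.0, -3.0, 40.0)
sample_chroma = chroma(sample[1], sample[2])
sample_delta_e = total_color_difference(sample)
print(round(sample_chroma, 2), round(sample_delta_e, 2))
```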

Modeling and Statistical Analysis
Using General Linear Model (GLM).
Data were analyzed using the general linear model (GLM) procedure with the JMP statistical package (JMP Institute Inc., Cary, NC, USA). Means were separated by least significant difference (LSD) analysis at p ≤ 0.05.

Using K-Means Clustering for Classification of Honey Adulteration Level.
In order to classify the levels of indirect honey adulteration, the k-means clustering algorithm was used. The technique is a nonhierarchical, unsupervised clustering method used to classify cases into categories called clusters, which are homogeneous within themselves and heterogeneous among each other. This is usually achieved by using Euclidean distance or other criteria for clustering data. The k-means clustering library in SPSS 18 (SPSS institute, North Carolina, USA) was used for this purpose. The first step involves specifying the number of clusters (k); here, 7 categories were used to cover the different levels of honey adulteration (0 to 100%). Next, the initial values of the aggregation centers, called the k "seeds", are estimated. The Euclidean distance between each observation and the cluster centers is then used to assign every unit to the closest cluster seed. The procedure is repeated as many times as necessary until no better reclassification is possible.
The sugar profile of adulterated honey samples (fructose, glucose, sucrose, and maltose content) and other physiochemical properties including pH, color, and water content were used as input variables for cluster classification [28,29].
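The assign-and-update loop described above can be sketched in a few lines. This is a minimal illustration of the generic k-means procedure, not the SPSS routine used in the study; the function name, the seed values, and the glucose readings are hypothetical, and only three clusters are used to keep the example small.

```python
import math


def kmeans(points, seeds, max_iter=100):
    """Basic k-means: assign points to the nearest centre, then update centres."""
    centers = [tuple(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the closest centre by Euclidean distance.
        new_labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # no point changed cluster, so stop
            break
        labels = new_labels
        # Update step: each centre becomes the mean of its assigned points.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Hypothetical glucose contents (%), one feature per sample.
glucose = [(33.4,), (33.1,), (31.0,), (31.2,), (29.1,), (29.3,)]
labels, centers = kmeans(glucose, seeds=[(33.0,), (31.0,), (29.0,)])
print(labels, centers)
```

Each sample is a tuple, so the same function accepts several input variables at once (for example glucose, fructose, and pH) without modification.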

Using SA-ANNs to Predict Honey Adulteration Level.
In addition to classification by k-means clustering, a hybrid simulated annealing coupled with artificial neural network algorithm (SA-ANNs) was used to predict the level of honey adulteration from 0 to 100%.
There are two reasons for coupling simulated annealing with ANNs. First, SA is used to provide a global solution for the ANN and to avoid becoming trapped in a local minimum during the optimization process. Second, SA is used to initialize the neuron weights and to select the ANN architecture automatically. Therefore, using the SA-ANN hybrid algorithm can substantially facilitate the development of a prediction model for honey adulteration percentage [30,31].
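The idea of using SA to search for network weights can be sketched as follows. This is a generic simulated annealing loop under stated assumptions, not the implementation used in the study: the cooling schedule, step size, tiny one-hidden-neuron network, and training data are all hypothetical, and the architecture-selection role of SA is not shown.

```python
import math
import random


def simulated_annealing(loss, start, steps=2000, t0=1.0, cooling=0.995,
                        step_size=0.1, seed=0):
    """Minimise loss(weights) by random perturbation with a cooling temperature."""
    rng = random.Random(seed)
    current = list(start)
    current_loss = loss(current)
    best, best_loss = list(current), current_loss
    temperature = t0
    for _ in range(steps):
        candidate = [w + rng.gauss(0.0, step_size) for w in current]
        candidate_loss = loss(candidate)
        delta = candidate_loss - current_loss
        # Accept improvements always; accept worse moves with
        # probability exp(-delta / T), which shrinks as T cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_loss = candidate, candidate_loss
            if current_loss < best_loss:
                best, best_loss = list(current), current_loss
        temperature *= cooling
    return best, best_loss


# Hypothetical training data (scaled input, scaled adulteration level).
xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ys = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]


def sse(weights):
    """Sum of squared errors of a one-hidden-neuron tanh network."""
    w1, b1, w2, b2 = weights
    return sum((w2 * math.tanh(w1 * x + b1) + b2 - y) ** 2
               for x, y in zip(xs, ys))


start = [0.0, 0.0, 0.0, 0.0]
initial_loss = sse(start)
best_weights, best_loss = simulated_annealing(sse, start)
print(initial_loss, best_loss)
```

In a hybrid scheme of this kind, the weights returned by the annealing search would serve as the starting point for gradient-based training rather than as the final model.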

HPLC-RID Sugar Profile.
The effect of feeding honey bees different proportions of sugar on glucose, fructose, and sucrose content is shown in Table 1. The glucose and fructose contents decreased significantly, from 33.4 to 29.06% and from 45.2 to 35.9%, respectively, as the amount of sucrose syrup in the feed increased. For all treatments, the sum of glucose and fructose contents was higher than the minimum standard value of 60 g/100 g set by the Codex Alimentarius [1]; the Jordanian standard requires not less than 65 g/100 g. The sucrose content, on the other hand, increased significantly from 0.19 to 1.80% as the sucrose syrup percentage in the feed increased. Fructose content was observed to be more sensitive to sucrose adulteration, since the difference between the control and 10% sucrose adulteration was more evident (45.2 and 39.8%, respectively). The high contents of glucose and fructose in sucrose-fed honey were explained by Guler et al. [32], who reported that 95% of the sucrose given to bees was converted to glucose and fructose by invertase, the enzyme responsible for the breakdown of sucrose, which is secreted by worker bees from the hypopharyngeal glands [33]. Guler et al. [11] reported similar results in honey from bees fed with 5, 20, and 100% sucrose syrup. They reported that glucose content increased with 20% feeding but decreased with 100% feeding of sucrose syrup. Additionally, they reported an increase in sucrose content and a decrease in fructose content. Cavrar et al. [17] studied the properties of pure and sucrose-adulterated honey samples using one concentration of water to sucrose, at a ratio of 1 : 1.5 (w/w), for each colony. They reported higher fructose and glucose contents and a lower sucrose content in control samples compared to those adulterated with sucrose syrup. They similarly reported that worker bees use invertase to convert the majority of sucrose to glucose and fructose.
Anklam [34] found that the actual proportion of fructose to glucose in any particular honey depends largely on the source of the nectar. The fructose to glucose ratio (F/G) is shown in Table 1. The results show that the control had a significantly higher F/G value (1.36) than honey adulterated with sucrose at all percentages, for which the ratio varied from 1.18 to 1.23. Tosi et al. [35] reported that an F/G ratio of 1.14 or less indicates fast granulation, while values greater than 1.58 are associated with no tendency to granulation. It can be concluded from these results that adulterated honey samples have a greater tendency to granulate. Similar studies reported F/G ratios of honeys to be 1.19 [11].
The pH value increased significantly across treatments from 3.04 to 4.63. The highest value was found in Trt 7 (4.63), while the lowest was found in Trt 1 (3.04). Özcan et al. [39] found that sugar feeding increased the pH value, which agrees with the present study. Ribeiro et al. [7] observed a similar effect when feeding honey bees with fructose syrup. Acidity decreased slightly but not significantly with increasing sucrose feeding percentage, from 7.0 meq/kg for Trt 1 to 4.00 meq/kg for Trt 7. All values were well within the standard (maximum of 50 meq/kg) reported by Codex Alimentarius [1]. Similarly, Guler et al. [11] found that acidity ranged from 8 to 16.9 meq/kg in honey from bees fed with 5, 20, and 100% sucrose syrup. Gebremariam and Brhane [40] explained this by the fact that sugar feeding caused a reduction in the content of dissociated organic acids, particularly gluconic acid, which is a byproduct of glucose oxidation by glucose oxidase, and of inorganic ions such as phosphate and chloride.

Color Measurement.
The effect of feeding different syrup concentrations on honey color is shown in Table 3. The results were expressed as L* for darkness/lightness (0 black, 100 white), a* (−a greenness, +a redness), and b* (−b blueness, +b yellowness). The results show some differences among samples fed with different sugar concentrations. The L* values of honey increased from 59.3 to 68.84, where a low L* value indicates a darker sample; lightness therefore increased as the syrup concentration of the feed increased. Random variations in a* and b* were also observed (−4.3 to 1.16 and 24.79 to 48.04, respectively). ΔE and chroma values also varied randomly and ranged from 68.96 to 78.45 and from 25.17 to 48.04, respectively. Kolayli et al. [38] found similar results when feeding honey bees with different types of syrups at random. They reported a darker color for pure honey compared to honey from bees fed with sucrose syrup. They further attributed the darker color to the flora involved and the associated vitamins, pigments, phenolic substances, mineral content, chlorophyll, carotene, and xanthophyll compounds.

K-Means Clustering.
The classification of the seven honey adulteration levels by k-means clustering, using sugar profile and pH, is shown in Table 4. The table shows the percent correct classification of honey adulteration level obtained with the different sugar types and with pH. Other physiochemical properties, including moisture content and acidity, were not found useful for classifying honey adulteration. The significance of the classification is demonstrated by both the F-statistic and the p value.
The results show clearly that glucose and total sugars provided the best classification results, with 100% correct classification of adulteration level, followed by fructose and sucrose content with 95% classification accuracy, and finally the pH value, which gave the lowest classification accuracy of 52%.
This suggests that glucose and total sugars can each be used separately to detect honey adulteration level accurately. It also suggests that a cost-effective and easy method based on total sugar content can be used to detect indirect honey adulteration without the need for a full sugar profile analysis. Table 5 shows the distances between the final seven cluster centers of the classification matrix for glucose. Larger distances between cluster centers indicate better classification. The distances varied between 0.497 for adulteration levels 0% and 80% and 2.783 for adulteration levels 0 and 20%. The results support earlier reports of successful classification with k-means clustering. A previous study [29] reported using k-means clustering to classify rice productivity in Indonesian provinces into three clusters successfully. Leemans and Destain [41] reported using a k-means hierarchical grading algorithm to detect defects in Jonagold apples. They reported a 91% correct classification from the accepted fruit. Bairam and Green [42] used color images with k-means-based clustering to detect cracks in watermelon. Melons were segmented, and their cracked parts were identified with the k-means clustering algorithm. The results showed that the method was effective in identifying melon cracking. Noviyanto and Abdulla [43] reported the use of a similar classification algorithm, k-nearest neighbor (kNN), to classify honey botanical origin with around 83% accuracy and 2.6% standard deviation. Cordella et al. [18] investigated indirect honey adulteration from 10 to 40% using several bee-feeding sugar syrups. They reported that using linear discriminant analysis (LDA) coupled with canonical analysis to classify honey adulteration resulted in a high classification efficiency of 96.5%.
Oroian and Ropciuc [44] reported that the use of linear discriminant analysis (LDA) with phenolic compounds and physicochemical parameters resulted in good classification of honey samples (92% correct) based on their botanical origin.

Simulated Annealing-Artificial Neural Network (SA-ANN).
Artificial neural networks are powerful tools used to predict complex behavior of input-output data. They have the advantage of being able to model any complex system if adequate data are available for network training.
One difficulty in developing ANNs is the determination of the initial weights used in the network topology. Therefore, a simulated annealing (SA) algorithm is used to optimize the initial weights used in building the ANN. ANNs use the sum of squared errors (SSE) function with a backpropagation (BP) algorithm to adjust the neuron weights, and the loop is repeated until the prespecified SSE is reached. Two commonly used ANN types are multilayer perceptron (MLP) and radial basis function (RBF); a detailed description of both can be found in Al-Mahasneh et al. [27]. In this study, the sugar profile (glucose, fructose, sucrose, maltose, and total sugar content) was used as the input to predict honey percent adulteration as the dependent variable. Data were randomly partitioned into 70% used to train the network and 30% used to validate the resulting model, in order to provide a valid model structure and avoid overfitting.
This is normally done to ensure that the model obtained is useful for predicting new unseen data points. The results of both types are shown in Table 6. Additionally, the RBF-ANN structure is shown in Figure 1, and the plot of predicted versus observed honey adulteration percent is shown in Figure 2. The RBF-ANN is shown because it gave better results than the MLP-ANN. The results showed a high prediction capability of honey percent adulteration using ANNs. The RBF-ANN with 10 nodes and softmax activation function provided slightly better prediction results than the MLP-ANN. This can be observed in its lower SSE (0.073 versus 0.096) and RE (0.021 versus 0.027) and its higher overall R² (0.992 versus 0.981). The validation SSE and validation coefficient of determination R² were 0.073 and 0.99, respectively. The results indicated that the ANN model developed was robust and able to predict new data points. Oroian and Ropciuc [44] reported the use of ANNs for classification of honey origin based on physicochemical parameters and phenolic compounds. They concluded that a multilayer ANN with 2 hidden layers was able to classify honey botanical origin with 95% accuracy. Al-Mahasneh et al. [27] reported using ANNs for successful prediction of wild flower honey viscosity using the combined effect of temperature, shear rate, and water content of honey. Cordella et al. [18] reported using a partial least squares model in linear regression to successfully predict the adulteration percentages of new honey samples that were adulterated by feeding bees with different industrial sugar syrups. Oroian and Ropciuc [44] used a 2-hidden-layer MLP ANN to successfully classify honey samples (94.8% accurate) on the basis of botanical origin.
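The random 70/30 partition and the fit statistics quoted in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names, the random seed, and the observed and predicted values are hypothetical, and the relative error (RE) reported by the modeling software is not reproduced because its exact definition is not given in the text.

```python
import random


def train_validation_split(rows, train_fraction=0.7, seed=0):
    """Randomly allocate rows to training and validation parts."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def sse(observed, predicted):
    """Sum of squared errors between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SSE / total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse(observed, predicted) / total


# Hypothetical sample indices and adulteration levels (%).
rows = list(range(10))
train, validation = train_validation_split(rows)

observed = [0, 20, 40, 60, 80, 100]
predicted = [2, 19, 41, 58, 82, 99]
print(len(train), len(validation), sse(observed, predicted),
      round(r_squared(observed, predicted), 4))
```

Computing the same statistics on the held-out 30% is what distinguishes the validation SSE and R² from the training values.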

Conclusions
The effect of indirect honey adulteration, which involves feeding honey bees with sucrose syrup, was evaluated using physiochemical properties including sugar profile, moisture, acidity, pH, and color. The glucose and fructose contents decreased significantly with increasing percentage of adulteration. On the other hand, sucrose content, pH value, and lightness (L*) increased significantly with percent adulteration. K-means clustering was effective in classifying honey adulteration percentage using glucose and total sugar content. Simulated annealing (SA) coupled with radial basis function artificial neural networks (RBF-ANNs) was able to predict adulteration percentage with high accuracy. It is concluded that indirect honey adulteration can be effectively detected using the k-means clustering algorithm based on glucose content or total sugar content in honey, which can be a low-cost and easy measurement method.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.