Simultaneously Verifying the Original Region of Green and Roasted Coffee Beans by Stable Isotopes and Elements Combined with Random Forest

Simultaneously verifying the region of origin of green and roasted coffee beans by chemical analysis is important for protecting the legal interests of stakeholders. In this study, 131 green coffee bean samples were collected from six different regions of origin and prepared at three roasting degrees (green, middle, and dark roasted); five stable isotope ratios (δ¹³C, δ¹⁵N, δ¹⁸O, δ²H, and δ³⁴S) and twelve elemental contents (Al, Cr, Ni, Zn, Ba, Cu, Na, Mn, Fe, Ca, K, and Mg) of the green, middle, and dark roasted samples (131 × 3) were analyzed. The fractionation of stable isotopes and the variation of elemental contents during roasting were evaluated: only hydrogen (²H) fractionated significantly, and elemental concentrations increased at a roughly constant rate. One-way analysis of variance (ANOVA) was used to compare the stable isotope ratios and elemental concentrations of all coffee bean samples from the six regions. Random forest (RF) was employed to build a discriminating model for simultaneously verifying the regions of origin of green and roasted coffee bean samples; this model provided 100% accuracy. The model has powerful distinguishing capability and is not influenced by the fractionation of hydrogen (²H) or the variation of element contents during the roasting process.


Introduction
Over the past decades, China's economy has made great progress, and its coffee industry has shown an overall upward trend. China's coffee bean imports exceeded 100,000 tons for the first time in 2021, reaching 122,800 tons; most were imported from coffee-producing tropical countries, and imports of specialty coffee increased sharply [1]. In recent years, the Chinese government carried out "Internet plus" projects; numerous trading companies and groups were founded, and many Internet celebrities appeared. These "Internet plus" commercial projects made coffee consumption convenient and efficient and helped coffee culture spread quickly, and more and more Chinese consumers have developed preferences for coffees that have unique flavor characteristics or are less abundant in the open market [2]. The flavor, taste, aroma, and style of coffee vary significantly with the geographical origin of the beans; beans from the most coveted regions of the world can command prices many times higher than the average global price, and consumers have their own preferences regarding geographical origin. Moreover, consumers are paying increasing attention to the quality and safety of coffee beans [3,4]. For these reasons, the mislabeling of the geographical origin of coffee beans has become another area of fraud. However, the relevant Chinese government agencies (customs and the administration for market regulation) have limited official technical methods available to check for mislabeled or fake geographical origins. At present, inspection protocols for imported coffee beans are mainly based on two national standards (GB/T19181-2018 and GB/T18007-2011) [5], neither of which includes a technical method for tracing the region of origin of coffee beans.
It is therefore highly valuable to develop a stable, inexpensive, and reliable method to identify the geographical origin of coffee beans in order to protect the legitimate interests of the many stakeholders of the coffee industry, such as consumers, coffee farmers, traders, importers, and distributors. This is especially true for the relevant government agencies, which need reliable technical tools to ensure the sustainable and healthy development of the coffee industry. Consumers normally purchase roasted coffee beans, so they care most about the region of origin of roasted beans. Distributors and coffee farmers trade in bags of green coffee beans, so they care most about the region of origin of green beans. Government agencies regulate the whole coffee industry, so they care about the geographical origin of both green and roasted beans. A reliable technical tool that can simultaneously classify the region of origin of green and roasted coffee beans should therefore be developed to meet the needs of all stakeholders.
To date, various analytical techniques [6-11] have been used to authenticate the geographical origin of coffee beans, such as terahertz (THz), near infrared (NIR), nuclear magnetic resonance (NMR), inductively coupled plasma mass spectrometry (ICP-MS), gas chromatography time-of-flight mass spectrometry (GC-TOF-MS), and isotope ratio mass spectrometry (IRMS). However, considering the stability, accuracy, dependability, and cost of these methods, stable isotope and trace element analyses are considered the most economical and effective for identifying the region of origin of green coffee beans. Several studies [12-15] also reported that stable isotope ratios and elemental contents are good indicators of the region of origin and that, combined with multivariate analysis methods, they are powerful technical tools for tracing the origin of food resources and agricultural products, including coffee beans, tea, and wine.
Multielement and stable isotope profiles of coffee bean samples from four regions of Ethiopia were analyzed by ICP, XRF (X-ray fluorescence spectrometry), and IRMS and then coupled with linear discriminant analysis, resulting in 80-89% successful classification [16]. Rodrigues et al. [11] used stable isotope ratios (δ¹³C, δ¹⁸O, δ¹⁵N, δ³⁴S, and δ⁸⁷Sr) and 30 elements of green coffee samples combined with multivariate statistical methods; five Hawaiian subregions (Hawaii, Kauai, Maui, Molokai, and Oahu) were verified with 100% accuracy. Stable isotopes and elements are used to determine the geographical origin of coffee beans [17] because they do not degrade during storage [18] and can serve as indicators of the soil and ecological environment. However, coffee beans lose 14-20% of their mass during the roasting process: water is lost, absolute elemental contents rise, and the carbohydrate, oil, and protein composition changes [19]. It is therefore uncertain whether stable isotopes fractionate during roasting and whether any such fractionation affects the distinguishing capability of the mathematical model. Meanwhile, trace elements are not volatilized during roasting, so their contents should increase, which may obscure the information about the geographical origin of roasted coffee beans [4]. Previous studies mostly focused on determining the geographical origin of green beans, and even the few studies that involved roasted coffee beans did not account for roasting-related changes in mineral concentrations and stable isotopes [4,6,8,17]. For these reasons, the roasting-related changes in mineral contents and isotope ratios were evaluated in this study.
The objective of this study was to develop a stable, reliable, and inexpensive approach that can simultaneously verify the regions of origin of green and roasted coffee beans. This technical tool can be used by consumers, traders, distributors, and other stakeholders to expose counterfeit coffee beans in the open market and to protect stakeholders of coffee beans from coveted regions.

Sample Collection and Preparation.
A total of 131 green coffee bean samples (C. arabica) were obtained from reliable importers and traders from six different coffee-growing countries: Colombia, Nicaragua, Panama, Brazil, China, and Rwanda. All samples have location information more specific than the country of origin; this information is provided in Table 1 and Figure 1. The green coffee beans listed in Table 1 were put into the roaster in 1 kg batches and roasted to different degrees as follows. Green bean samples received no treatment. For middle-roasted beans, the roaster was preheated to 200°C, 1 kg of green beans was added, and the roaster was slowly heated to 210°C over 10-11 min, ensuring that the Agtron value was 75. For dark-roasted beans, the roaster was preheated to 200°C, 1 kg of green beans was added, and the roaster was slowly heated to 223°C over 13-15 min, ensuring that the Agtron value was 50.
The sample powder was prepared by a method similar to that of a previous paper [4]. Each green and roasted coffee bean sample (100 g) was ground in a mill (AG204, Mettler Toledo, Switzerland) three times for 5 min each time to obtain a particle size of <1 mm and a homogeneous sample. After grinding, samples were dried overnight at 45°C, and 5 mg of sample powder was then weighed into tin capsules (4 × 6 mm), which were folded closed (P6, Mettler Toledo, Switzerland).

Stable Isotope Ratios Analysis.
The δ¹³C, δ¹⁵N, and δ³⁴S values of coffee beans (green and roasted) were analyzed [20] by IRMS-Flash HT 2000 (Thermo Fisher, Bremen, Germany). Samples and reference materials were placed in an autosampler, combusted in a reactor at 1020°C, and converted to CO₂, N₂, and SO₂, respectively. After dehydration with Mg(ClO₄)₂, the gases were separated on a column (70°C) and then entered the IRMS ion source; the carrier gas was helium at a constant flow of 180 mL/min, and the reference gas was CO₂ at a constant flow of 50 mL/min. To correct the measured isotope ratios, one reference material was included in every test batch (8 samples). The δ¹³C, δ¹⁵N, and δ³⁴S data were calibrated against USGS-43 (in-house standard, δ¹³C V-PDB = −21.28‰, δ¹⁵N Air = 8.44‰, and δ³⁴S V-CDT = 10.46‰).
The δ²H and δ¹⁸O values of coffee beans (green and roasted) were analyzed [20] by IRMS-Flash HT 2000 (Thermo Fisher, Bremen, Germany). For hydrogen isotope ratio (δ²H) analysis, samples encapsulated in silver capsules were placed in a desiccator together with the reference material for longer than 72 h to compensate for atmospheric effects. Samples and reference materials were placed in an autosampler, pyrolyzed to H₂ and CO at 1300°C, and separated on a column at 85°C before entering the IRMS ion source; the carrier gas was helium at a constant flow of 200 mL/min, and the reference gas was CO₂ at a constant flow of 110 mL/min. To correct the measured isotope ratios, one reference material was included in every test batch (5 samples); in addition, every sample was analyzed twice to prevent memory effects when the hydrogen isotope ratio was measured. The δ²H and δ¹⁸O data were calibrated against USGS-56 (in-house standard, δ²H = −44.0‰ and δ¹⁸O = 27.23‰).
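The calibration step described above can be illustrated with a short sketch that expresses an isotope ratio in delta notation and applies a one-point offset correction against the batch reference. This is only a sketch under stated assumptions and not the authors' procedure: the one-point correction scheme, the function names, and every reading other than the accepted δ¹³C of USGS-43 quoted above are hypothetical.

```python
def delta_permil(r_sample: float, r_standard: float) -> float:
    """Delta value in per mil: (R_sample / R_standard - 1) * 1000."""
    return (r_sample / r_standard - 1.0) * 1000.0


def one_point_correction(measured, ref_measured, ref_accepted):
    """Shift raw delta values by the offset observed on the batch reference.

    Assumption: a single reference per batch and a pure offset (no scale
    stretching), which is the simplest possible normalisation.
    """
    offset = ref_accepted - ref_measured
    return [round(d + offset, 2) for d in measured]


# Accepted delta13C of USGS-43 as quoted in the text (per mil, V-PDB).
USGS43_D13C = -21.28

# Hypothetical raw readings for one batch; these are NOT data from the study.
raw_batch = [-27.90, -26.45, -28.10]
ref_reading = -21.40

corrected = one_point_correction(raw_batch, ref_reading, USGS43_D13C)
print(corrected)
```

In practice, laboratories often use two or more reference materials so that both offset and scale can be corrected; the single-reference version is shown only because the text mentions one reference material per batch.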

Elemental Contents Analysis.
The elemental contents of the coffee bean samples were measured by a standard method [14], briefly described as follows: 1.0 g of coffee powder was added to an acid-cleaned Teflon vessel and reacted with HNO₃ until digestion was complete; the digests were dried, dissolved in 5% HNO₃, and filtered through membranes (0.45 μm). Ca, K, and Mg were analyzed using inductively coupled plasma atomic emission spectroscopy (Optima 8300, PerkinElmer, USA), and the following elements (Al, Cr, Ni, Zn, Ba, Cu, Na, and Fe) were analyzed using ICP-MS (NexION 1000, PerkinElmer, USA).

Statistical Methods and Data Analysis.
The analysis covered 131 × 3 coffee bean samples (131 samples from six different producing regions, each at three roasting degrees: green, middle, and dark roasted). R version 3.5.1 (https://www.r-project.org/) was employed to analyze the data matrix of stable isotope ratios and mineral concentrations. One-way analysis of variance (ANOVA) was used to analyze the roasting-related changes in isotope ratios and elemental contents and to compare the isotope ratios and mineral concentrations of green, middle, and dark roasted coffee bean samples from the six regions of origin at a 5% significance level. Random forest (RF) was employed to evaluate the discriminating power of the isotope ratios and mineral concentrations and to build the models distinguishing the geographical origin of green, middle, dark, and mixed coffee bean samples. RF is a powerful machine learning classifier [21,22] whose key advantages include its nonparametric nature, high classification accuracy, and ability to determine variable importance. In recent years, RF has been highly successful in modeling high-dimensional datasets for a range of purposes, such as product authentication in food analysis.
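To make the ANOVA step concrete, the following sketch computes the one-way ANOVA F statistic for one variable measured at three roasting degrees. It is a minimal pure-Python illustration rather than the R code used in the study; the function name and the δ²H values are hypothetical. A p-value would additionally require the F distribution, for example via R's aov or SciPy's scipy.stats.f_oneway.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical delta2H values (per mil) for green, middle, and dark roasted
# beans; these numbers are invented for illustration only.
green = [-61.0, -60.0, -59.0]
middle = [-60.0, -59.0, -58.0]
dark = [-57.0, -56.0, -55.0]

f_stat, df_between, df_within = one_way_anova_f([green, middle, dark])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F relative to the critical value for the given degrees of freedom indicates that at least one group mean differs, which is how a roasting-related shift in a single isotope ratio would show up.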

The Fractionation Effect of Stable Multi-Isotopes from Green to Dark Roasted Coffee Beans.
Five stable isotope ratios (δ¹⁵N, δ¹³C, δ³⁴S, δ¹⁸O, and δ²H) of coffee bean samples at three roasting degrees (green, middle, and dark roasted; 131 × 3 samples in total) from six regions of origin (Colombia, Nicaragua, Panama, Brazil, China, and Rwanda) were analyzed by one-way ANOVA. The results showed that only δ²H changed significantly (P < 0.01); that is, only hydrogen isotopes fractionated significantly, and the other four isotopes showed no significant fractionation during the roasting process (Table 2).
Hydrogen (²H) has the largest range of stable isotope abundance variation in nature, up to 250%, and isotope fractionation is strongly affected by temperature [23]. The fractionation of hydrogen (²H) might be related to the evaporation of water from the coffee samples during roasting, when the temperature changes dramatically (up to almost 200°C). Several authors [24-26] also showed that hydrogen (²H) fractionated significantly during the processing of food products (tomato, noodles, and tea), which was likewise related to changes in temperature and water content.
For oxygen (¹⁸O), the relative mass difference between ¹⁸O and ¹⁶O is smaller, so the fractionation effect of oxygen during evaporation is far smaller than that of hydrogen [23]. Our work showed that oxygen (¹⁸O) did not fractionate significantly during the roasting process. Previous studies [24,27] likewise found no obvious oxygen fractionation in beef and tea during roasting. These studies help explain why hydrogen fractionated significantly while oxygen did not during the roasting of the coffee bean samples.
Intense chemical reactions take place during the roasting of coffee beans, such as the Maillard reaction between amino acids and reducing sugars, which gives coffee beans their brown color, distinctive flavor, and aroma [19]; nevertheless, there was no obvious carbon isotope fractionation during roasting. These reactions probably involve only bonds such as -C-OH and H-C- and not the breakdown of carbon-carbon bonds (-C-C-) [23]. Bostic and Guo [27,28] also indicated that there was no significant change in carbon (¹³C) in yeast buns, sweet cookies, and roasted beef, all of which undergo the Maillard reaction during roasting or baking. Chemical conversion, physical transport, the nitrogen cycle, and sulfate fixation may cause stable isotope fractionation, especially in mineralization and nitrification processes [23,28]. Table 2 shows that there was no significant enrichment of ¹⁵N and ³⁴S or depletion of ¹⁴N and ³²S during the roasting of coffee beans, which demonstrates that ¹⁵N and ³⁴S are relatively stable during the roasting or baking of food materials. Several studies [24,27,29] also reported that ¹⁵N and ³⁴S in roasted beef, coffee beans, and tea showed no significant fractionation during processing, which means that both nitrogen and sulfur isotopes are stable in the process.
In our work, no significant fractionation occurred during the roasting process except for hydrogen, which is consistent with previous studies. The δ¹⁸O and δ²H values of plants are related not only to the latitude of the growing region but also to local precipitation. The δ¹³C values depend on the plant species (C-3 or C-4 plants), the δ¹⁵N values are significantly affected by chemically synthesized fertilizers, and δ³⁴S derives mainly from soil sulfate and atmospheric SO₂; this explains why isotopes can be effective contributors for determining the geographical origin of food resources. Several authors [30-32] have shown that stable isotope ratios are excellent indicators of the geographical origin of agricultural and food products, including grape wine, olive oil, onions, coffee beans, tea, and beef (fresh and roasted).

The Variation of the Multielements from Green to Dark Roasted Coffee Beans.
Coffee beans are agricultural products; their elements derive mainly from the soil of the coffee plantation and are linked to local soil composition, so elements can be used to effectively discriminate coffees by growing area [33]. Furthermore, elements are neither volatilized nor degraded, although their concentrations change with the mass loss of coffee beans during roasting. Numerous studies [6,9,11,29] used concentration ratios of elements to identify the geographical origin of coffee beans and demonstrated that many of the world's coffee-producing regions can be distinguished from one another on the basis of element ratios. However, the variation of elemental contents may obscure the geographical information of coffee beans [19]. For this reason, we analyzed the contents of twelve minerals (Al, Cr, Ni, Zn, Ba, Cu, Na, Fe, Mn, Ca, K, and Mg) in all coffee bean samples from the six countries (131 × 3) by ICP-OES and ICP-MS and then divided the average element contents of green, middle, and dark roasted beans by the average contents of green beans. Table 3 shows the resulting element ratios (C_Gb/C_Gb, C_Mb/C_Gb, and C_Db/C_Gb, where C denotes content, Gb green beans, Mb middle roasted beans, and Db dark roasted beans). First, the C_Gb/C_Gb ratios are 1 by definition. Second, the C_Mb/C_Gb ratio is 1.17, except for Al and Fe from Colombia (1.18) and K from Brazil (1.18), which are slightly higher. Third, the C_Db/C_Gb ratio is 1.21, except for Al from Nicaragua (1.20) and Mn from China (1.58); the latter is much higher than the other ratios, probably owing to abnormal samples or measurement errors.
These ratios show that elemental contents were lowest in green beans and higher in roasted beans; although trace element concentrations increased, the ratios among element concentrations stayed relatively constant, an effect also demonstrated by Van Cuong et al. [33]. Belitz et al. [19] summarized that coffee beans lose 14-20% of their mass during roasting, which raises the mineral contents of roasted beans. Consequently, the increase in mineral contents is relatively uniform across elements; whether it affects the accuracy of the origin classification model is examined below.
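The link between mass loss and the concentration ratios can be checked with simple arithmetic. The sketch below assumes that an element is fully retained while the bean loses a fraction of its mass, so that its concentration rises by a factor of 1/(1 − mass loss); this simplification and the function names are assumptions made for illustration, not part of the original analysis.

```python
def concentration_factor(mass_loss_fraction: float) -> float:
    """Concentration ratio roasted/green if the element is fully retained."""
    return 1.0 / (1.0 - mass_loss_fraction)


def implied_mass_loss(ratio: float) -> float:
    """Mass-loss fraction that would produce a given roasted/green ratio."""
    return 1.0 - 1.0 / ratio


# Mass-loss range quoted in the text (14-20%).
for loss in (0.14, 0.20):
    print(f"{loss:.0%} mass loss -> factor {concentration_factor(loss):.3f}")

# Typical ratios reported in the text for middle and dark roasted beans.
for ratio in (1.17, 1.21):
    print(f"ratio {ratio:.2f} -> implied mass loss {implied_mass_loss(ratio):.1%}")
```

Under this assumption, a 14-20% mass loss corresponds to factors of about 1.163 to 1.250, and the reported ratios of 1.17 and 1.21 correspond to mass losses of about 14.5% and 17.4%, both within the quoted range.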

Determining Origins of Green, Middle, and Roasted Coffee Beans.
In this study, 131 samples (C. arabica) were collected from six countries: Colombia (Valle del Cauca), Nicaragua (Matagalpa), Panama (Volcan-candela), Brazil (Sul de Minas), China (Baoshan city), and Rwanda (Nyamagabe). Because the coffee species may influence stable isotope ratios and elemental contents, only C. arabica was used in this work. Colombia is the third largest coffee-producing country [34] and provides high-quality beans for consumers. Valle del Cauca in particular, located in the Andes Mountains, has a high altitude (1450-2000 m) and fertile soil, which provide ideal conditions for coffee trees; its coffee has bright acidity and elegant sweetness together with a complex floral flavor, good balance, a silky mouthfeel, and a long aftertaste, and it has a large following worldwide. The coffee industry is an important economic pillar for Nicaragua [35], where traditional varieties (Caturra and Bourbon) are mainly planted and yield high-quality coffee with diverse tastes; Matagalpa in particular mainly produces specialty coffee. All six regions are excellent coffee-growing areas that provide excellent beans for consumers and buyers from different countries and attract the attention of coffee consumers worldwide [36]. The stable isotope ratios and mineral compositions of coffee beans are influenced not only by agronomic practices and climate but also by the altitude, soil, and water of the growing area [12,29]. To build a reliable and stable discrimination model for the geographical origin of green coffee beans, one-way ANOVA and random forest (RF) were applied in this part. ANOVA is commonly used to analyze differences among groups of data [14]. RF is an ensemble learning method with high classification accuracy that has been widely applied to classify the region of origin of food products, and it requires a relatively large data matrix [21,22].
The five isotope ratios and twelve elemental contents were combined and analyzed by one-way ANOVA. The results showed significant differences among the regions of origin at the 95% confidence level (Table 4), which means that these parameters are effective for identifying the geographical origin of green coffee bean samples. Random forest was then carried out: the dataset was randomly divided into a training group (70%) and a test group (30%); the training group was used to build the discrimination model, and the test group was used to assess its accuracy. The results are shown in Table 5. All seven green coffee bean samples from Colombia were classified as Colombian, and the samples from the other five regions were likewise all classified correctly. The model therefore has 100% accuracy and a powerful capability to classify the geographical origin of coffee bean samples from different regions, and combining isotopes and elements with random forest is an effective approach to verifying the region of origin of coffee beans. The same procedure was applied to the middle and dark roasted coffee bean samples, with the same results as for the green samples: both showed significant differences among the regions of origin at the 95% confidence level (Tables 6 and 7), and the regions of origin were classified with 100% accuracy (Tables 8 and 9). This shows that the fractionation of hydrogen during roasting did not affect the stability of the discriminating model for samples of the corresponding roasting degree and that the variation of element contents did not obscure the discriminating capability of the model. In other words, the results of this part provide sufficient support for establishing a discriminating model of the region of origin regardless of whether the samples are green or roasted coffee beans.
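The train/test workflow described above can be sketched with a toy random forest written in pure Python. It is not the R implementation used in the study: the tree-growing details, the stratified 70/30 split (the text states only that the division was random), the function names, and the synthetic three-region dataset are all assumptions chosen to keep the example self-contained.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y, features):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(y)
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def grow_tree(X, y, rng, n_try):
    """Grow one tree, trying a random subset of n_try features at each node."""
    split = None
    if len(set(y)) > 1:
        split = best_split(X, y, rng.sample(range(len(X[0])), n_try))
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority label
    f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            grow_tree([X[i] for i in left], [y[i] for i in left], rng, n_try),
            grow_tree([X[i] for i in right], [y[i] for i in right], rng, n_try))


def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_try = max(1, int(len(X[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap sample
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], rng, n_try))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


def stratified_split(y, test_frac, rng):
    """Split indices per class so each region keeps the same test fraction."""
    train, test = [], []
    for label in sorted(set(y)):
        idx = [i for i, lab in enumerate(y) if lab == label]
        rng.shuffle(idx)
        cut = round(len(idx) * test_frac)
        test += idx[:cut]
        train += idx[cut:]
    return train, test


# Synthetic data: three hypothetical regions, three hypothetical variables,
# well separated on purpose. These are NOT measurements from the study.
rng = random.Random(42)
centres = {"region_A": 0.0, "region_B": 10.0, "region_C": 20.0}
X, y = [], []
for label, centre in centres.items():
    for _ in range(20):
        X.append([centre + rng.uniform(-1.0, 1.0) for _ in range(3)])
        y.append(label)

train_idx, test_idx = stratified_split(y, 0.30, rng)
forest = fit_forest([X[i] for i in train_idx], [y[i] for i in train_idx])
predictions = [predict_forest(forest, X[i]) for i in test_idx]
accuracy = sum(p == y[i] for p, i in zip(predictions, test_idx)) / len(test_idx)
print(f"test accuracy: {accuracy:.0%}")
```

Because the synthetic regions are far apart relative to the within-region spread, the forest separates them perfectly; this mirrors the remark in the text that large distances among regions of origin make classification easier, and it says nothing about performance on closely related regions.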
This work and previous studies [9,11,17,29] demonstrate that stable isotope ratios and elements are excellent indicators of the geographical origin of coffee beans and other agricultural products. Their discriminating information is associated with the growing area and was not compromised by the roasting process. Furthermore, stable isotopes and elements used as indicators and coupled with chemometrics can classify the geographical regions of green and roasted coffee bean samples at the same time.

[Figure 1: sampling locations (Valle del Cauca, Minas Gerais, Matagalpa, Yunnan province, Volcan-candela, and Nyamagabe).]

Journal of Food Quality

Figure 2 shows the results; the accuracy of this model was 100%, and all samples were correctly assigned to their own producing regions. Given the strong predictive ability of this method, the contributions of the parameters in the four coffee bean models were analyzed (Table 11). The largest contributor group consists of six parameters (Na, δ²H, Al, Ni, Mn, and Cr), whose cumulative contributions exceed 59%; hydrogen contributed 9.67%, 9.31%, 8.65%, and 9.69%, respectively. The stability of the discriminating model was not affected by the fractionation of hydrogen. The high accuracy of the method in this work was probably due to the large distances among the regions of origin, which lead to large differences in the growth environments of the sample beans.

Origin     Colombia  Nicaragua  Panama  Brazil  China  Rwanda  Accuracy (%)
Colombia   17        0          0       0       0      0       100
Nicaragua  0         14         0       0       0      0       100
Panama     0         0          14      0       0      0       100
Brazil     0         0          0       18      0      0       100
China      0         0          0       0       19     0       100
Rwanda     0         0          0       0       0      14      100

Origin     Colombia  Nicaragua  Panama  Brazil  China  Rwanda  Accuracy (%)
Colombia   63        0          0       0       0      0       100
Nicaragua  0         45         0       0       0      0       100
Panama     0         0          54      0       0      0       100
Brazil     0         0          0       54      0      0       100
China      0         0          0       0       63     0

A mathematical model with 100% accuracy that can simultaneously identify the geographical origin of green, roasted, or mixed coffee beans was adopted for the first time in this study; Figure 3 shows the procedure. Our work reveals that the fractionation of hydrogen and the variation of elements do not influence the predictive ability of the discriminating model, which means that combining stable isotopes and elements is a reliable method for tracing the geographical origin of green and roasted coffee beans. This work was based on isotope ratios and multielements, which can produce promising results for regions of origin with different types of soil and climate; however, the approach may fail to distinguish growing areas in adjacent countries with similar climates.

Verifying the geographical origin of coffee beans is challenging because the chemical profiles of the beans are influenced not only by species and growing area but also by postharvest processing, storage conditions, and roasting procedures [3]. Based on the phenolic and methylxanthine profiles of coffee beans, Alonso-Salces et al. [37] misclassified 30-60% of the Cameroonian samples by region of origin, which showed that phenolic profiles are not reliable indicators: they are not stable and are easily influenced by species, the roasting process, and other factors.
The aroma and volatile compounds of coffee beans have been analyzed by PTR-ToF-MS (proton transfer reaction time-of-flight mass spectrometry), and significant differences were found [10,38]; this is a rapid and direct technique, but it is much more expensive than the approach used in our work. Serra et al. [39] identified the region of origin of green coffee beans with 88% successful classification using carbon, nitrogen, and boron isotopes, and Anderson and Smith [40] tried to classify the geographical origin of coffee beans according to 11 elements, with 70-85% accuracy; the accuracy of both methods therefore needs to be improved. NMR has also been applied to trace the geographical origin of coffee bean samples [8,41]; it is a powerful technique, but it requires a complicated extraction process and is expensive. Owing to their cost and efficiency, stable isotope ratios (IRMS) and multielements (ICP) have gained increasing interest among researchers tracing the geographical origin of agricultural products.

Conclusions
Five stable isotope ratios (δ¹⁵N, δ¹³C, δ³⁴S, δ¹⁸O, and δ²H) and twelve elemental contents (Al, Cr, Ni, Zn, Ba, Cu, Na, Mn, Fe, Ca, K, and Mg) are excellent indicators of the growing area of coffee beans, and random forest is a powerful tool for predicting their geographical origin. In our work, the mathematical model showed a strong ability to predict the geographical origin of green, roasted, or mixed coffee beans at the same time, with accuracy up to 100%. Moreover, the stability of the model is not changed by the fractionation of hydrogen or the variation of element contents during roasting. However, the growing areas sampled in this study were on a large geographical scale, so the model has only been used to verify the producing area of coffee beans from different countries. In the next step, coffee bean samples from smaller growing areas or subregions should therefore be collected for further study in order to classify local growing areas. More work is necessary to protect the economic benefits of stakeholders in the coffee industry.

Data Availability
The data used in this study are available from the corresponding author on request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.