Geographical Discrimination of Ground Amazon Cocoa by Near-Infrared Spectroscopy: Influence of Sample Preparation

Laboratory of Biotechnological Process (LABIOTEC), Graduate Program in Food Science and Technology (PPGCTA), Federal University of Pará (UFPA), Belém 66095-100, Pará, Brazil
Laboratory Adolpho Ducke, Research Field of the Emílio Goeldi Museum Paraense, Perimetral Avenue, 1901, Terra Firme, Belém, Pará, Brazil
Laboratory for the Extraction of Plant Products (LABEX), Graduate Program in Food Science and Technology (PPGCTA), Federal University of Pará (UFPA), Belém 66095-100, Pará, Brazil
Embrapa Eastern Amazon, Belém 66095-100, Pará, Brazil


Introduction
Cocoa (Theobroma cacao L.) is one of the most important commercial crops in the world. It supplies raw material to the food, pharmaceutical, and cosmetics industries, accounts for a capital turnover of millions of dollars per year, and significantly influences the world economy and income generation in several developing countries [1].
It is the primary raw material for the production of chocolate, which contains functional and bioactive compounds, such as polyphenols, that are associated with anticancer [2], vasodilator [3], antioxidant [4,5], and anti-inflammatory [6] activity. The quality of cocoa seeds, like that of various other food products, is correlated with several factors, including genotype, crop conditions, and climate. Geographical origin is often recognized and appreciated by the food industry and consumers and is an important factor that can add value to cocoa beans [7].
Unfermented cocoa seeds have an estimated chemical composition of 1% organic acids, 1% caffeine, 1-2% theobromine, 2-3% sucrose, 2-3% cellulose, 4-6% pentosans, 4-6% starch, 5-6% polyphenols, 10-15% proteins, 30-32% lipids, and 32-39% moisture [1,8]. The profiles of these compounds differ even before fermentation, according to the region where the seeds were cultivated. The fermentation and drying stages are crucial in cocoa processing, especially for the formation of the color, aroma, and flavor of chocolate. These steps lead to profound changes in the chemical composition of cocoa seeds, changing the profile of phenolics, sugars, peptides, and triglycerides, among others [9][10][11][12]. Fat is the main constituent of fermented cocoa seeds; its fatty acid profile changes during fermentation and directly affects the texture, viscosity, melting behavior, aroma, and flavor of the food produced from it [13].
According to the report from the Executive Committee of the Cocoa Plantation Plan (Ceplac), the state of Pará reinforced its position as the largest cocoa producer in Brazil in 2017 [14], and this production is dispersed in different localities, which produce fermented cocoa seeds with different sensory characteristics.
Geographical identification can help in the quality control and traceability management of these seeds. This identification and geographical authentication can be performed through specific laboratory analyses, which are mostly time-consuming and tedious and require chemical products that are sometimes harmful to the environment [15][16][17]. Infrared (IR) spectroscopy, particularly in the near-infrared (NIR) region, is a simple and fast method that generates no chemical waste and requires minimal sample handling.
This method has been used effectively to determine the origin of various food and nonfood products [18].
Numerous food matrices have already been discriminated by geographical origin using the NIR technique, among them honey [19], alcoholic beverages [20], mushrooms [21], green tea [22], butter [23], and wines [24]. The efficiency of the NIR technique has already been tested on cocoa from different regions of the world, and several cocoa properties have been analyzed and correlated with the NIR method, such as color, volatile compounds, phenolic compounds, antioxidants, fermentation index, and fat [7,17,25-29].
In addition, multivariate classification methods (LDA, KNN, BPANN, and SVM) have been tested by Teye et al. [7] for the discrimination of cocoa originating in different regions. In that study, Teye et al. [7] used 194 samples and evaluated the discrimination of beans from seven growing regions in Ghana (Ashanti, Brong-Ahafo, Central, Eastern, Volta, northwest, and southwest). PCA was used to test the models, as it provides relevant information about the trend of the samples and the formation of clusters [17,25-29]. The purpose of this article was to use the NIR technique to evaluate the influence of sample preparation on the geographical discrimination of cocoa fruits from the cities of Medicilândia, Tucumã, and Tomé-Açu, which are the main producing regions in the state of Pará, Brazil.

Selecting Collection Regions.
The regions for this work were selected on the basis of production volume and the geographical distinction between them. Cities from the three main producing regions of the state of Pará (Medicilândia, Tucumã, and Tomé-Açu) that are geographically distant from each other were chosen. In the Transamazon region, southwest of Pará, the city of Medicilândia (geographic coordinates 03°26′45″ S and 52°53′20″ W) was chosen, while in southeastern and northeastern Pará, the cities of Tucumã (geographic coordinates 06°44′52″ S and 51°09′39″ W) and Tomé-Açu (geographic coordinates 02°28′41.3″ S and 48°16′50.7″ W) were chosen, respectively. Production quality was the second criterion; the region of Medicilândia stands out in the international market for its superior quality. Raw cocoa samples were collected following the traditional harvesting practice of each region, choosing random fruits at the edges and the center of the plantation.

Sample Collection and Preparation
After the harvest, the fruits were broken and stripped; the cotyledons were separated and milled individually in a multifunctional mill (model A11, IKA, Staufen, Germany). The milled material was then sieved through a 600 μm mesh and stored at −22°C until analysis.
Fermented cocoa samples were obtained after the fermentation and drying processes, according to the traditional methods of each producer in the three regions mentioned above. These samples were dried to constant weight in an oven (model 80 series, Lucadema, São José do Rio Preto, SP, Brazil) at 80°C for moisture standardization. After drying, the dried fermented (DF) cocoa cotyledons were milled individually, sieved through a 600 μm mesh, and stored at −22°C until analysis.
After the NIR analysis, the 117 samples of raw cocoa (RC) were dried to constant weight in an air circulation oven at 80°C. Spectra of these dried unfermented (DU) samples were then acquired again.

Defatting of Cocoa Samples.
Dried fermented (DF) cocoa samples (56 samples) were exhaustively defatted with petroleum ether (Synth, Diadema, SP, Brazil) at 55 ± 1°C for 24 hours, according to method 963.15 of the Association of Official Analytical Chemists (AOAC) [30]. After removing the fat, the samples were kept for 24 hours at 80°C in an air circulation oven (model 80 series, Lucadema, Brazil) to remove the solvent. After solvent removal, the dried fermented degreased (DFD) cocoa samples were stored at −22°C until analysis.

Obtaining NIR Spectra.
NIR spectra were obtained using an MPA FT-NIR spectrometer (Bruker Optics, Ettlingen, Germany). The spectral data were acquired in absorbance mode over the spectral range from 3500 to 12500 cm⁻¹, with a resolution of 16 cm⁻¹ and an average of 32 scans per spectrum. For the spectral readings, 3 ml vials holding about 1 g of the milled and sieved cotyledon were used. The readings were performed at approximately 25°C.
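For readers more used to wavelengths, the acquisition range can be converted with the standard relation between wavenumber and wavelength. The short sketch below is only an illustration; the function name is hypothetical and not part of the instrument software.

```python
def wavenumber_to_wavelength_nm(wavenumber_cm1: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    if wavenumber_cm1 <= 0:
        raise ValueError("wavenumber must be positive")
    # 1 cm = 1e7 nm, so wavelength (nm) = 1e7 / wavenumber (cm^-1)
    return 1e7 / wavenumber_cm1


# Limits of the acquisition range reported in the text.
short_end_nm = wavenumber_to_wavelength_nm(12500.0)
long_end_nm = wavenumber_to_wavelength_nm(3500.0)
print(f"{short_end_nm:.0f} nm to {long_end_nm:.0f} nm")
```

Under this conversion, the reported range of 3500 to 12500 cm⁻¹ corresponds to wavelengths from 800 nm to roughly 2857 nm.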

Discrimination Model Development by Geographical Origin.
The spectroscopy software used to construct the discrimination models was OPUS 6.5 (Bruker Optics, Ettlingen, Germany). The spectral data were preprocessed in OPUS 6.5 according to the type of sample. For the RC, DF, and DFD cocoa samples, vector normalization (SNV, standard normal variate) was applied, while for the DU samples, SNV was applied together with the first derivative.
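The two pretreatments can be illustrated with a minimal sketch. It assumes the textbook form of SNV (each spectrum is centred on its own mean and divided by its own standard deviation) and uses a plain central finite difference for the first derivative; the exact algorithms implemented in OPUS may differ, and the absorbance values and point spacing below are hypothetical.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it to unit standard deviation."""
    centre = mean(spectrum)
    spread = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        raise ValueError("a flat spectrum cannot be scaled")
    return [(a - centre) / spread for a in spectrum]


def first_derivative(spectrum, step):
    """Central finite difference; interior points only, so the result is two points shorter."""
    return [
        (spectrum[i + 1] - spectrum[i - 1]) / (2 * step)
        for i in range(1, len(spectrum) - 1)
    ]


# Hypothetical absorbances at evenly spaced wavenumbers (illustrative values only).
raw = [0.52, 0.55, 0.61, 0.58, 0.54]
# The same spectrum with a baseline offset and a multiplicative scatter factor.
distorted = [2.0 * a + 0.3 for a in raw]

corrected_raw = snv(raw)
corrected_distorted = snv(distorted)
derivative = first_derivative(corrected_raw, step=1.0)  # spacing in arbitrary units
```

In this toy case the two corrected spectra coincide, which is the point of SNV: additive offsets and multiplicative scatter effects that are constant across a spectrum are removed before the samples are compared.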
After the pretreatments were applied, the data were processed to develop the discrimination model by geographical origin for each set of samples. Principal component analysis (PCA), an exploratory analysis method, was applied to the multivariate data to make visual inspection of the results easier. The spectral region that best differentiated the geographic regions was chosen by the software operator, based on information related to the chemical composition of the samples, such as water, fat, and protein content.
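As a rough illustration of how PCA turns spectra into scores that can be inspected for clusters, the sketch below extracts only the first principal component by power iteration on mean-centred data. It is a simplified stand-in for the PCA routine of the software, written with the standard library only; the function names and the toy "spectra" are assumptions made for the example.

```python
def mean_center(rows):
    """Subtract the column mean (the mean spectrum) from every row."""
    n = len(rows)
    width = len(rows[0])
    col_means = [sum(row[j] for row in rows) / n for j in range(width)]
    return [[row[j] - col_means[j] for j in range(width)] for row in rows]


def first_pc(rows, iterations=200):
    """Return (loading, scores) for the first principal component via power iteration."""
    x = mean_center(rows)
    width = len(x[0])
    # Start from the centred row with the largest norm so the start vector is not zero.
    v = max(x, key=lambda row: sum(c * c for c in row))
    for _ in range(iterations):
        t = [sum(a * b for a, b in zip(row, v)) for row in x]  # X v
        w = [sum(t[i] * x[i][j] for i in range(len(x))) for j in range(width)]  # X^T (X v)
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0:
            raise ValueError("no variance in the data")
        v = [c / norm for c in w]
    scores = [sum(a * b for a, b in zip(row, v)) for row in x]
    return v, scores


# Hypothetical pretreated spectra: three samples from one origin, three from another.
spectra = [
    [1.0, 2.0, 0.1], [1.1, 2.2, 0.0], [0.9, 1.8, 0.1],
    [3.0, 6.0, 0.0], [3.1, 6.2, 0.1], [2.9, 5.8, 0.0],
]
loading, scores = first_pc(spectra)
```

Samples from the same hypothetical origin end up with scores of the same sign, which is the kind of grouping one looks for in a score plot. The sign of a principal component is arbitrary, so only the relative positions of the samples are meaningful.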

Spectral Analysis.
The complete NIR spectra obtained from the RC, DU, DF, and DFD samples are shown in Figure 1.
Prominent peaks can be observed around 9000-8000, 8500-6000, 6000-5000, and 5000-4500 cm⁻¹ within the NIR range. In the RC spectrum (Figure 1(a)), two intense absorption peaks are observed, at 5312 and 7202 cm⁻¹, which are related to the water content of the samples and, according to [7], correspond to the region of the first overtone of the O-H stretch and the O-H deformation. The water absorption bands should be removed to reduce interference with the bands corresponding to the CH and CO groups and to combinations of amine groups [31]. Accordingly, Figures 1(b) and 1(c) show that, after the water was removed, the absorption peaks of the other components in the samples became more evident.
In none of the spectra presented in Figure 1 was it possible to see differences between fruits from the different regions analyzed, even in the fermented samples. These differences became noticeable only after mathematical treatment, through the application of statistical analysis techniques for model development.

Discrimination Models for Raw Cocoa.
For the RC cocoa samples, the spectral band used (6200-4000 cm⁻¹) could not discriminate groups of fruits from different regions, as observed in Figure 2(a). The score plot presented in Figure 2(b), referring to the spectral range between 4000 and 6200 cm⁻¹, shows the lack of differentiation between the scores of the three regions. Bands referring to the phenol group, which can be observed in the region of 3562-3322 cm⁻¹, and bands related to the aromatic ring, in the regions of 2925-2854 cm⁻¹ (attributed to the CH stretch of the aromatic ring) and 1645-1544 cm⁻¹ (attributed to C=C in the aromatic ring) [7], were overlapped. This effect is directly related to the amount of water and fat in the RC cocoa samples. Therefore, the high absorption of NIR radiation by the water present in the samples contributes negatively to the analysis, since the water bands overlap with the other bands of interest, making it challenging to construct an analytical and statistical model for the geographic discrimination of raw (in natura) Amazonian cocoa.

Discrimination Models for Dried Unfermented Cocoa.
After the removal of water by drying, the DU samples showed a trend toward the formation of groups related to geographic region. The best spectral range for DU discrimination was between 4300 and 9300 cm⁻¹, as observed in Figure 2(c). The bands located at 4300-4597, 4902-5199, 5805-6102, and 6406-9300 cm⁻¹ in the DU spectrum are associated with the carbonyl group spectral regions, corresponding to the stretching combination (CH2 and CH), the first harmonic of CH in the aromatic ring, the combination of C-C and C-N, and the second harmonic of N-H [15]. These molecular vibrations are caused by functional groups belonging to polyphenols, alkaloids, vicilin class globulins, proteins, amines, acids, polysaccharides, and other aromatic compounds [7]. These diverse functional groups act as a fingerprint of each sample from the different Amazon regions. The differentiation between the samples from Tomé-Açu and Tucumã was evident, as observed in Figure 2(d). However, the scores of the Medicilândia samples were mixed with the scores of the samples from the other two regions. Several factors can explain the formation of characteristic groups of cocoa, such as chemical composition [32,33], degree of fermentation [34], and genotype [35], which may be associated with the region of origin of the seeds [36]. According to [37], cocoa, like other food products, has its characteristics influenced by locality.
This influence is directly related to the main components, namely alkaloids, polyphenols, proteins, amines, polysaccharides, acids, and diverse aromatic compounds, as previously mentioned [7].

Number of samples per region (four sample sets, in the order given in the source): Tomé-Açu 40, 20, 40, 19; Tucumã 40, 21, 40, 19; Medicilândia 37, 31, 37, 18; total 117, 72, 117, 56.

Discrimination Models for Fermented Cocoa.
The fermentation of cocoa beans is essential because of the diverse microbial processes that develop as a consequence of changes in temperature, pH, and oxygen availability. These processes promote significant biochemical changes in the type and concentration of flavor precursors in cocoa beans [37], which makes fermentation a crucial step in the formation of quality sensory attributes. Complex chemical and biochemical changes and interactions occur in the cocoa beans during fermentation, drying, and storage, contributing to the complexity and identity of Amazon cocoa [38]. Fermentation produces flavor and aroma precursors, such as free amino acids and peptides from the enzymatic degradation of proteins and reducing sugars from the enzymatic degradation of sucrose, as well as a significant increase in volatile compounds such as organic acids, esters, alcohols, and aldehydes, during and after the fermentation of cocoa beans. The concentration of these precursors is influenced by the stoichiometry of their parent compounds; it is therefore a response to the nutritional quality of the cultivated soil, from which the nutrients necessary for the synthesis of those parent compounds were taken, and so provides good indicators of the production region [39].
Fermentation was a crucial factor in forming the characteristics of each region because, unlike the dried unfermented (DU) samples, the dried fermented (DF) samples, whose spectra are shown in Figure 2(e), allowed good discrimination between groups from different regions. The best spectral range for discrimination was 6000 to 6800 cm⁻¹.
These results show that fermentation has a strong influence on the biochemical characteristics of the samples, strong enough to allow a clear distinction between them, as shown in Figure 2(f).
Caligiani et al. [10] reported that the progressive fermentation of cocoa seeds causes the hydrolysis of peptides into amino acids, reducing the violet color of the cotyledons by the end of fermentation. The phenolic compounds are reduced upon drying; this reduction is mainly attributed to enzymatic action, followed by nonenzymatic reactions due to quinone polymerization, with an increase in pH and a high uptake of O2 during sun drying [9]. Sirbu et al. [12] estimated that the lipid content increases by 3 to 6% during fermentation compared with the value before fermentation. In addition, the authors observed a change in the triacylglycerol profile caused by fermentation. Some polar triacylglycerols, such as derivatives of hydroxyl allyl fatty acid present in unfermented cocoa seeds, were not found in dried fermented cocoa seeds, thus allowing chemical differentiation between unfermented and fermented cocoa seeds.
According to the literature, then, fermentation changes the profile and concentration of various components of cocoa. Such components, or their concentrations, can be characteristic of each region, which makes geographic identification possible. The NIR technique combined with chemometrics can identify similarities and differences among various types of samples and classify them, making it possible to visually identify groups that have similar or distinct characteristics.
In this work, we used 15 external samples (not used in the construction of the model) to test the efficiency of the discrimination model (5 samples from Tomé-Açu, 5 from Tucumã, and 5 from Medicilândia). All 15 samples (100%) were correctly classified according to their regions of origin.

Discrimination Models for Degreased Cocoa.
To evaluate the importance of the lipid content of the samples for geographic differentiation, the DF cocoa samples were degreased and the spectral evaluation was carried out again, using the same spectral treatment applied to discriminate the nondegreased fermented samples. The scores of the degreased samples are shown in Figure 3.
As observed in Figure 3, even though the mathematical treatment used for the DF (dried fermented) samples was applied, no separation between the three groups was observed; however, a partial separation between two regions (Tomé-Açu and Tucumã) was observed, the same as that seen in the DU samples. This shows that fat removal made it difficult to classify the samples into three geographic regions using the DF sample discrimination model, partially corroborating the results of [40], who showed a strong correlation between fatty acid composition and geographic differentiation. In our research, we observed that fermented samples could be discriminated more easily with their original butter content. The lipid and protein storage cells in cocoa seeds have complex cytology, with a compacted cytoplasm containing multiple vacuoles, where proteins and lipids can be found, as well as other components such as starch granules, which are important for the definition of specific flavor and aroma characteristics.
These biochemical and cytological characteristics of the cells change in proportion and morphology from region to region, affected by several physical, chemical, and climatic factors [39].
Since there was an indication of partial separation of the DFD samples, a model with two separation steps was developed. In the first separation step, the spectral region of 5950 to 6835 cm⁻¹ was used (Figure 4(a)), in which the regions of Tomé-Açu and Tucumã were separated (Figure 4(b)). In the second step of the model, applied to the sample data for the two groups that did not show clear separation in the first step (Medicilândia and Tucumã), the spectral region of 7100 to 9300 cm⁻¹ was used (Figure 4(d)). Two spectral regions were thus used to build the complete separation model of the DFD samples, as shown in Figure 5.
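The logic of a two-step model can be sketched as follows. This is a minimal sketch under stated assumptions: it reads the first step as isolating the Tomé-Açu samples and the second step as separating Medicilândia from Tucumã, and it uses a nearest-centroid rule as a stand-in for the visual separation on the PCA score plots. All numeric values and function names are hypothetical.

```python
TOME_ACU = "Tomé-Açu"
REST = "Medicilândia or Tucumã"


def centroid(vectors):
    """Mean vector of a group of score vectors."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, vector):
    """Label of the centroid closest to the given vector."""
    return min(centroids, key=lambda label: sq_dist(centroids[label], vector))


def classify_two_stage(scores_window1, scores_window2, stage1, stage2):
    """Stage 1 uses scores from the first spectral window; stage 2 from the second."""
    if nearest(stage1, scores_window1) == TOME_ACU:
        return TOME_ACU
    return nearest(stage2, scores_window2)


# Hypothetical training scores (illustrative values, not data from the study).
stage1 = {
    TOME_ACU: centroid([[1.0, 0.9], [1.2, 1.1]]),
    REST: centroid([[-1.0, -0.8], [-0.9, -1.1], [-1.1, -1.0]]),
}
stage2 = {
    "Medicilândia": centroid([[0.5, -0.4], [0.6, -0.5]]),
    "Tucumã": centroid([[-0.5, 0.4], [-0.4, 0.6]]),
}

result_a = classify_two_stage([1.1, 1.0], [0.0, 0.0], stage1, stage2)
result_b = classify_two_stage([-1.0, -1.0], [0.55, -0.45], stage1, stage2)
result_c = classify_two_stage([-1.0, -0.9], [-0.45, 0.5], stage1, stage2)
```

The design choice mirrors the text: each step works on scores from its own spectral window, so the second window is consulted only for samples that the first window cannot resolve.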
The separation model for the DFD cocoa seed samples was built in two stages. For the first stage, the spectral data of the cocoa from the three regions (Medicilândia, Tucumã, and Tomé-Açu) were used (Figure 4(b)), and for the second stage, the spectral data of the cocoa from Medicilândia and Tucumã, whose separation was less evident in the first stage, were used (Figure 4(c)). Thus, compared with the previous separation model (for DF seeds), we could observe that the lipid fraction had considerable influence; however, it was not essential for geographic identification. According to Sirbu et al. [12], after fermentation, cocoa presents a different profile of triacylglycerols and starch granules, which influences its spectroscopic behavior. After fat removal, the other components that changed during fermentation remain, making geographic differentiation possible [12]. The application of the NIR technique is quite simple; however, acquiring the spectral information and analytical data is time-consuming and costly. Once the database of spectral and analytical information is obtained, the mathematical model can be constructed using computational algorithms that plan and/or optimize such experimental procedures. The resulting model can then be used to analyze external samples from spectral information alone.
The spectral data are correlated with the data previously obtained, and the result is estimated. In this way, the analysis becomes simple, fast, and economical: once the prediction model has been constructed, it is enough to obtain the spectra of the material in question and apply the model.

Conclusions
The results showed that it was possible to evaluate the influence of sample preparation on the geographic discrimination of cocoa from different regions of Pará state using the NIR technique. The fermentation process proved important for geographic identification, as it imparted specific characteristics to the cocoa of each region evaluated. The lipids of fermented cocoa were crucial in the formation of groups in the fermented samples, also indicating that fermentation is a crucial step in developing characteristics specific to each collection region. Thus, it is assumed that the biota of each region affects the fatty acids contained in cocoa butter differently, since similar results were not found in defatted samples. The effect of climate and production process, although not directly explicit, cannot be ruled out, considering that they can significantly influence the type of biota and the fermentation process, as well as other possible variables.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Ethical Approval
Ethics approval was not required for this research.

Conflicts of Interest
The authors declare that there are no conflicts of interest.