This paper proposes the joint use of Fourier Transform Infrared Attenuated Total Reflectance Spectroscopy (FTIR-ATR) and Partial Least Squares (PLS) regression for the simultaneous quantification of four adulterants (coffee husks, spent coffee grounds, barley, and corn) in roasted and ground coffee. Roasted coffee samples were intentionally blended with the adulterants at adulteration levels ranging from 0.5 to 66% w/w. The methodology included the identification and removal of outliers. High correlation coefficients (0.99 for both calibration and validation) coupled with low errors (0.69% for calibration; 2.00% for validation) confirmed that FTIR-ATR can be a valuable analytical tool for quantifying adulteration in roasted and ground coffee. The method is simple, fast, and reliable for the proposed purpose.
New and challenging risks, such as adulteration, have emerged as food supply chains become increasingly global and complex, although fraud in the food sector has been an issue since ancient times. Food adulteration tends to be economically motivated and is achieved through addition, substitution, or removal of food ingredients. It is an issue that concerns not only consumers, but producers and distributors as well [
Coffee is one of the most valuable and most commonly consumed beverages in the world. Due to its high price, this commodity is usually targeted for adulteration. Impurities and adulterants are the most common concern. Any low-cost material of biological origin could be used as a potential adulterant in coffee [
In order to develop analytical tools suitable to detect and identify adulteration in roasted and ground coffee, different techniques and procedures have been proposed, including UPLC [
In previous studies, we have shown that Diffuse Reflectance Fourier Transform Infrared Spectroscopy (DRIFTS) is suitable for identification, discrimination, and quantification of adulterants in roasted and ground coffee [
Arabica coffee, barley, and corn samples were acquired from local markets. Coffee husks were provided by the Minas Gerais State Coffee Industry Union (Sindicato da Indústria de Café do Estado de Minas Gerais, Brazil). Spent coffee grounds were provided by a local soluble coffee manufacturer (Café Brasília, Minas Gerais, Brazil).
Coffee beans (50 g), coffee husks (30 g), barley (50 g), and corn (30 g) samples were roasted in a convection oven (Model 4201D Nova Ética, São Paulo, Brazil) at temperatures ranging from 200 to 260°C, under different time intervals. Roasting degrees (light, medium, and dark) were established by comparing luminosity (
Mass composition of adulterated coffee samples.
Sample | Adulteration level (%) | Coffee mass fraction (%) | Spent coffee grounds mass fraction (%) | Coffee husks mass fraction (%) | Barley mass fraction (%) | Corn mass fraction (%) |
---|---|---|---|---|---|---|
1 | 66 | 33.3 | 33.3 | 33.3 | ||
2 | 50 | 50 | 50 | |||
3 | 50 | 50 | 50 | |||
4 | 40 | 60 | 10 | 10 | 10 | 10 |
5 | 40 | 60 | 20 | 20 | ||
6 | 40 | 60 | 20 | 20 | ||
7 | 20 | 80 | 5 | 5 | 5 | 5 |
8 | 20 | 80 | 10 | 10 | ||
9 | 20 | 80 | 10 | 10 | ||
10 | 10 | 90 | 5 | 5 | ||
11 | 10 | 90 | 5 | 5 | ||
12 | 10 | 90 | 3.33 | 3.33 | 3.33 | |
13 | 10 | 90 | 10 | |||
14 | 10 | 90 | 10 | |||
15 | 10 | 90 | 10 | |||
16 | 10 | 90 | 10 | |||
17 | 1 | 99 | 1 | |||
18 | 1 | 99 | 1 | |||
19 | 1 | 99 | 1 | |||
20 | 1 | 99 | 1 | |||
21 | 2 | 98 | 1 | 1 | ||
22 | 2 | 98 | 1 | 1 | ||
23 | 2 | 98 | 1 | 1 | ||
24 | 2 | 98 | 1 | 1 | ||
25 | 4 | 96 | 1 | 1 | 1 | 1 |
26 | 4 | 96 | 2 | 2 | ||
27 | 4 | 96 | 2 | 2 | ||
28 | 8 | 92 | 2 | 2 | 2 | 2 |
29 | 8 | 92 | 4 | 4 | ||
30 | 8 | 92 | 4 | 4 | ||
31 | 0.5 | 99.5 | 0.5 | |||
32 | 0.5 | 99.5 | 0.5 | |||
33 | 0.5 | 99.5 | 0.5 | |||
34 | 0.5 | 99.5 | 0.5 |
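The rows of the table follow a simple rule: the coffee mass fraction is 100 minus the adulteration level, and the adulteration level is shared equally among the adulterants present in the blend. The sketch below reproduces that arithmetic. The function name `blend_composition` and the adulterant labels are illustrative choices, not part of the original procedure, and the equal split is an assumption read off the tabulated values.

```python
def blend_composition(adulteration_level, adulterants):
    """Return mass fractions (% w/w) for a coffee blend.

    adulteration_level: total adulterant content in % w/w.
    adulterants: names of the adulterants in the blend (hypothetical labels).
    Assumes the adulteration level is split equally among the adulterants.
    """
    if not 0 <= adulteration_level <= 100:
        raise ValueError("adulteration level must be between 0 and 100")
    if not adulterants:
        raise ValueError("at least one adulterant is required")

    share = adulteration_level / len(adulterants)
    composition = {"coffee": 100.0 - adulteration_level}
    for name in adulterants:
        composition[name] = share
    return composition


# A 40% blend with all four adulterants, as in sample 4 of the table.
sample_4 = blend_composition(
    40, ["spent coffee grounds", "coffee husks", "barley", "corn"]
)
```

For sample 4 this gives 60% coffee and 10% of each adulterant, matching the tabulated row.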
All measurements were performed in a dry controlled atmosphere (20 ± 0.5°C) employing a Shimadzu IRAffinity-1 FTIR Spectrophotometer (Shimadzu, Japan) with a deuterated L-alanine-doped triglycine sulfate (DLATGS) detector. A Pike sampling accessory (MIRacle), with zinc selenide window, was employed for the ATR measurements. All spectra were recorded in the range of 4000–700 cm−1 with 4 cm−1 resolution and 20 scans and submitted to background subtraction (atmosphere spectra). Preliminary tests were performed to evaluate the effect of particle size (0.39 mm <
Because the 34 solid mixtures were prepared manually, five replicate spectra of each sample were acquired by FTIR-ATR using different portions of the sample, in order to ensure representativeness. A total of 170 spectra (34 samples × 5 replicates) were therefore obtained for the adulterated samples.
MATLAB software, version 7.13 (MathWorks, Natick, MA, USA), and PLS Toolbox version 6.5 (Eigenvector Technologies, Manson, WA, USA) were employed for data analysis. PLS was employed for quantification of adulterants mixed in roasted coffee samples using the ATR spectra as chemical descriptors, with adulteration levels ranging from 0.5% to 66% in mass (see Table
The data were submitted to two sequential evaluations. The first focused on the efficiency of different data preprocessing techniques. The second addressed the importance of the variables in the quantification process. In this step, different spectral ranges were evaluated in order to check whether the use of a specific region could improve the quality of the model.
The purpose of preprocessing is to linearize the response of variables and remove extraneous sources of variation (variance), which are not of interest in the analysis. Interfering variance appears in almost all real data because of systematic errors present in the experiment, requiring the model to work harder [
Mean centering corresponds to subtraction of the average absorbance value of a given spectrum from each data point. Multiplicative scatter correction (MSC), originally developed to compensate for the effects of light scattering in reflectance spectroscopy, has become a widely employed technique for removing general spectral drift features such as day-to-day intensity variations. Spectral derivatives are commonly used for baseline correction, and they also reveal small peaks that are difficult to detect in the original spectra. However, differentiation also decreases the signal-to-noise ratio, so a smoothing filter (Savitzky-Golay) was employed to provide noise reduction. SNV is applied to each spectrum individually: the mean and standard deviation of all the data points of the spectrum are calculated, the mean is subtracted from every data point, and the result is divided by the standard deviation. Absorbance normalization consisted of dividing (i) the difference between the absorbance value at each data point and the minimum absorbance value by (ii) the difference between the maximum and minimum absorbance values [
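Three of the preprocessing steps described above can be sketched in a few lines of standard-library Python. This is an illustration only, not the PLS Toolbox implementation used in the study: the function names `snv`, `minmax_normalize`, and `msc` are hypothetical, SNV is assumed to use the sample standard deviation (n − 1 denominator), and MSC is assumed to use the mean spectrum of the data set as its reference.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own
    mean and standard deviation (sample standard deviation assumed)."""
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(a - m) / s for a in spectrum]


def minmax_normalize(spectrum):
    """Absorbance normalization: (A - A_min) / (A_max - A_min)."""
    lo, hi = min(spectrum), max(spectrum)
    return [(a - lo) / (hi - lo) for a in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum x is fitted as x = a + b * ref by least squares and
    then corrected to (x - a) / b.
    """
    ref = [mean(column) for column in zip(*spectra)]  # mean spectrum
    ref_mean = mean(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)

    corrected = []
    for x in spectra:
        x_mean = mean(x)
        slope = sum((r - ref_mean) * (xi - x_mean) for r, xi in zip(ref, x)) / sxx
        intercept = x_mean - slope * ref_mean
        corrected.append([(xi - intercept) / slope for xi in x])
    return corrected
```

Each function takes absorbance values as plain lists: `snv` and `minmax_normalize` act on one spectrum at a time, whereas `msc` needs the whole set of spectra because its reference is computed from all of them.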
The optimal number of latent variables (LV) for each model was estimated by a cross-validation method (venetian blinds), based on the smallest value of root mean square error of cross-validation (RMSECV). Model performance was measured by evaluation of the root mean square errors for both calibration (RMSEC) and validation (RMSEP) sets, calculated as follows:
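In its common form, a root mean square error is the square root of the mean squared difference between reference and predicted values, RMSE = sqrt(Σ(yᵢ − ŷᵢ)² / n), evaluated on the calibration samples for RMSEC, on the held-out samples for RMSECV, and on the validation samples for RMSEP. The sketch below assumes that form (some software divides RMSEC by the calibration degrees of freedom instead of n) and shows how a venetian blinds scheme assigns samples to cross-validation splits. The names `rmse` and `venetian_blinds` are illustrative and are not PLS Toolbox functions.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((y - p) ** 2 for y, p in zip(reference, predicted))
    return math.sqrt(squared / len(reference))


def venetian_blinds(n_samples, n_splits):
    """Yield (train_indices, test_indices) for venetian blinds cross-validation.

    Split k holds out samples k, k + n_splits, k + 2 * n_splits, ...
    so every sample is held out exactly once.
    """
    for k in range(n_splits):
        test = list(range(k, n_samples, n_splits))
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        yield train, test


# RMSECV for a given number of latent variables is obtained by pooling the
# predictions of all held-out samples and passing them to rmse(); the number
# of latent variables giving the smallest RMSECV is then retained.
splits = list(venetian_blinds(6, 3))
```

Because the held-out samples are interleaved, the ordering of the data matters: if replicate spectra of one mixture are stored consecutively, replicates of the same mixture can fall in both the training and the held-out portions of a split.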
Model optimization was performed by detection and elimination of outliers. Outliers correspond to samples that are very different from the rest of the data set, and their detection is crucial when developing multivariate models. In this study, outlier detection in the calibration set was based on the methodology proposed by Valderrama et al. [
Table
Performance results of full-spectrum PLS models based on different data preprocessing techniques.
Data preprocessing | LV | RMSEC (%) | Correlation coefficient (calibration) | RMSEP (%) | Correlation coefficient (validation) |
---|---|---|---|---|---|
Mean centering (MC) | 5 | 3.80 | 0.97 | 3.56 | 0.98 |
Multiplicative Scatter Correction (MSC) + MC | 7 | 1.70 | 0.99 | 2.53 | 0.99 |
MSC + first derivatives + smoothing + MC | 4 | 2.54 | 0.99 | 3.37 | 0.98 |
Standard Normal Variates (SNV) + MC | | | | | |
SNV + first derivatives + MC | 6 | 1.78 | 0.99 | 2.68 | 0.99 |
Absorbance normalization + MC | 8 | 1.67 | 0.99 | 2.52 | 0.99 |
First derivatives + MC | 6 | 2.39 | 0.99 | 3.00 | 0.98 |
LV: latent variables;
The next step was to evaluate whether selecting a specific spectral range could improve prediction accuracy, given that the full spectra could contain systematic variables that do not necessarily represent sample variance. For this reason, the plot of regression coefficients, which indicates the main regions responsible for the quantification, is shown in Figure
Full-spectrum regression coefficients (4000–700 cm−1) of the PLS model based on the data submitted to SNV followed by mean centering.
Average normalized ATR spectra obtained for roasted coffee (brown color), roasted coffee husks (pink color), spent coffee grounds (blue color), roasted barley (yellow color), and roasted corn (green color).
An evaluation of the coefficients shown in Figure
In view of the aforementioned, the tested ranges were 4000–700 cm−1 (full spectra), 1735–700 cm−1, and 1135–700 cm−1. New models were built using these selected regions and the data were submitted to SNV and mean centering as preprocessing strategies. Table
Performance results of PLS models based on different data ranges.
Wavenumber range (cm−1) | LV | RMSEC (%) | Correlation coefficient (calibration) | RMSEP (%) | Correlation coefficient (validation) |
---|---|---|---|---|---|
4000–700 (full spectra) | | | | | |
1735–700 | 7 | 1.74 | 0.99 | 2.77 | 0.98 |
1134–700 | 5 | 2.80 | 0.98 | 3.90 | 0.97 |
LV: latent variables;
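Building a model on a restricted range amounts to keeping only the absorbance values whose wavenumbers fall inside that range before preprocessing and regression. A minimal sketch, assuming each spectrum is stored as a list of absorbances alongside a matching list of wavenumbers (the function name `select_range` and this data layout are hypothetical):

```python
def select_range(wavenumbers, spectrum, high, low):
    """Keep the points of a spectrum with low <= wavenumber <= high.

    wavenumbers and spectrum are parallel lists; wavenumbers are in cm^-1.
    Returns the retained wavenumbers and the corresponding absorbances.
    """
    if len(wavenumbers) != len(spectrum):
        raise ValueError("wavenumbers and spectrum must have equal length")
    kept = [(w, a) for w, a in zip(wavenumbers, spectrum) if low <= w <= high]
    return [w for w, _ in kept], [a for _, a in kept]


# Toy example with five points; a real spectrum has many more.
wn = [4000, 1735, 1200, 1135, 700]
absorbance = [0.10, 0.20, 0.30, 0.40, 0.50]
wn_sub, abs_sub = select_range(wn, absorbance, high=1735, low=700)
```

The same selection has to be applied to every calibration and validation spectrum so that all models in the comparison see the same variables.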
As the best PLS model was built with the full spectra submitted to SNV and mean centering, the next step was to optimize it by applying the outlier detection procedure. The outliers were detected at the 99% confidence level, and the results are summarized in Table
Optimization of PLS model by detection and removal of outliers.
Model | 1st | 2nd | 3rd | 4th |
---|---|---|---|---|
Number of calibration samples | 102 | 92 | 86 | |
Number of validation samples | 68 | 68 | 68 | |
LV | 8 | 7 | 8 | |
RMSEC (%) | 1.44 | 1.19 | 0.82 | |
RMSEP (%) | 2.42 | 5.16 | 4.76 | |
Correlation coefficient (calibration) | 0.99 | 0.99 | 0.99 | |
Correlation coefficient (validation) | 0.99 | 0.96 | 0.96 | |
LV: latent variables;
(a) Experimental versus predicted values of adulteration (% w/w) of coffee samples based on the optimized PLS model after outlier removal. (b) Residual versus adulteration levels (% w/w) of coffee samples based on the optimized PLS model after outlier removal.
A comparison of the model obtained in the present study with the one based on DRIFTS [
Comparison of the performances of models based on DRIFTS and FTIR-ATR.
Optimized models | DRIFTS [ | FTIR-ATR |
---|---|---|
Number of calibration samples | 88 | |
Number of validation samples | 44 | |
LV | 10 | |
RMSEC (%) | 1.61 | |
RMSEP (%) | 2.34 | |
Correlation coefficient (calibration) | 0.99 | |
Correlation coefficient (validation) | 0.98 | |
LV: latent variables;
PLS models of ATR spectra were successfully developed. The optimized model was built with full spectra (4000–700 cm−1) submitted to SNV and mean centering as the data preprocessing strategy. It was capable of predicting adulteration levels ranging from 0.5% to 40%. For this final model, the determination coefficients were 0.99 for both calibration and validation sets, and the errors observed during calibration and validation were quite low, 0.69% and 2.00%, respectively. Because the full spectrum provided more robust models, it can be concluded that the detection of adulteration and the discrimination of adulterated and nonadulterated coffee samples cannot be attributed to a single class of components but instead depend on a variety of compounds, such as lipids, chlorogenic acids, caffeine, and polysaccharides. PLS and FTIR-ATR proved to be promising techniques, suitable for quantification of multiple adulterants in roasted and ground coffee.
The authors declare that there is no conflict of interests regarding the publication of this paper.
The authors acknowledge financial support from the following Brazilian government agencies: CAPES, CNPq, and FAPEMIG.