Navel Orange Maturity Classification by Multispectral Indexes Based on Hyperspectral Diffuse Transmittance Imaging

Maturity grading is important for fruit quality, and nondestructive maturity detection can greatly benefit both consumers and the fruit industry. In this paper, hyperspectral images of navel oranges were obtained using a diffuse transmittance imaging system. Multispectral indexes were built from the hyperspectral data to identify maturity. Five indexes were proposed that combine the spectra at wavelengths of 640 and 760 nm (red edges) and 670 nm (for chlorophyll content) to grade the navel oranges into three maturity stages. The index (T670 + T760 − T640)/(T670 + T760 + T640) appeared to be the most appropriate for classifying maturity, especially for distinguishing immature oranges, which can be identified directly from the value of this index. Different indexes were used as the input of linear discriminant analysis (LDA) and of the k-nearest neighbor (k-NN) algorithm to identify the maturity, and k-NN with (T670 + T760 − T640)/(T670 + T760 + T640) reached the highest correct classification rate of 96.0%. The results showed that the built index was feasible and accurate for the nondestructive classification of oranges based on hyperspectral diffuse transmittance imaging. It will help to develop low-cost and real-time multispectral imaging systems for the nondestructive detection of fruit quality in industry.


Introduction
Fruit quality plays a vital part in marketing, and the quality and harvest time of oranges generally depend on experienced farmers who judge the fruit by on-tree visual inspection [1]. However, as maturity may be influenced by many factors, manual identification can result in an inappropriate harvest time before the fruit matures commercially [2]. Meanwhile, the fruits must be handled and processed after harvesting. It is therefore important to promote maturity grading with nondestructive analytical methods so that the fruits can be sorted for different commercial applications.
Hyperspectral imaging is a nondestructive technology that combines spectroscopic and imaging techniques, providing spectral and spatial information simultaneously [3]. The spatial distribution of a chemical entity can therefore be obtained from the spectral analysis at each pixel. As a result, each hyperspectral image contains a large amount of information in a three-dimensional (3D) form called a "hypercube." The object can be characterized more reliably by the hypercube than by traditional machine vision or spectroscopy techniques [4].
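As a rough illustration of the hypercube layout, the following Python sketch stores a tiny synthetic cube as nested lists indexed by (row, column, band). The dimensions, values, and function names are made up for the example and are not taken from this study.

```python
# Toy hypercube: 2 rows x 3 columns x 4 bands (synthetic values, not measured data).
ROWS, COLS, BANDS = 2, 3, 4
cube = [[[r * 100 + c * 10 + b for b in range(BANDS)]
         for c in range(COLS)]
        for r in range(ROWS)]

def pixel_spectrum(cube, row, col):
    """Return the full spectrum (one value per band) at one spatial position."""
    return list(cube[row][col])

def band_image(cube, band):
    """Return the 2-D spatial image recorded at a single band."""
    return [[pixel[band] for pixel in row] for row in cube]

spectrum = pixel_spectrum(cube, 1, 2)   # spectral information at one pixel
image = band_image(cube, 3)             # spatial information at one band
```

Reading along the band axis gives the spectrum of one pixel, while fixing the band gives an ordinary grayscale image, which is why the cube carries both kinds of information at once.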
Because each hyperspectral image contains a large amount of information, it is critical to choose the characteristic bands that carry effective information about the quality attributes. Many algorithms have been applied to characteristic band selection in accordance with quality parameters such as soluble solids content (SSC), titratable acidity (TA), and firmness. There have been some studies on the quality of thin-skin fruits. Cen et al. [5] applied supervised classification algorithms combined with feature selection methods to identify chilling injury in cucumbers with the hyperspectral imaging technique. Li et al. [4] detected defects on peach skin using hyperspectral imaging (325-1100 nm). Based on the proposed multispectral algorithm, the overall detection accuracy for the tested samples was 96.6%. Generally, hyperspectral imaging detection is conducted in reflection mode. In this mode, it is not easy to collect the internal information of thick-skin fruits such as oranges. Moreover, owing to the smooth skin of oranges, light spots often appear in the obtained images, causing difficulties in further analysis. This study therefore used diffuse transmittance mode, which avoids the light spots generated in reflection mode. In addition, diffuse transmittance reduces the influence of the shape, size, and core of the fruit, because the lighting angle in diffuse transmittance systems can be adjusted to reduce stray radiation [6].
In studies of fruit maturity discrimination, Sun et al. [7] identified Hami melon maturity with hyperspectral imaging technology in combination with characteristic wavelength selection methods and SVM. At least 9 characteristic variables were selected. Zhang et al. [8] used the spectral and texture features at 6 characteristic wavelengths in 441.1-1013.97 nm to classify strawberry maturity with an accuracy of over 85%. In those studies, algorithms were adopted to select important wavelengths, and more than 3 wavelengths were kept. Maturity can be evaluated comprehensively through various chemical components; some researchers believe that wavelengths associated with pigments or water content are closely related to maturity [9]. Qin and Lu [10] correctly classified tomatoes into three ripeness groups based on the ratio of the absorption coefficient at 675 nm (for chlorophyll content) to that at 535 nm (for anthocyanins). Schouten et al. [11] found correlations between the chlorophyll content in tomato pericarp tissue and the NDVI, (R780 − R660)/(R780 + R660), measured by remittance VIS spectroscopy.
In industrial applications, the high cost of the equipment and the slow processing of images remain the main obstacles to the development of hyperspectral imaging for fruit inspection [12]. The multispectral vision camera is considered a solution, as it is a cheaper system and takes less processing time because only a few (three or four) wavelengths and images are needed [13]. Hyperspectral imaging can be applied in the laboratory to compare different wavelength combinations. Qin et al. [14] developed a prototype for real-time inspection of citrus canker based on two wavelengths centered at 730 and 830 nm. The two-band spectral imaging module was integrated with the machine's sorting capacity to carry out online inspection of canker with samples moving at a speed of 5 fruits/s, and the system presented an overall classification accuracy of 95.3%. The wavelengths used in the prototype had been identified previously from hyperspectral reflectance images [15,16].
Therefore, to select multispectral indexes that could be used to develop low-cost and real-time imaging systems, this study aimed to find indexes appropriate for the maturity classification of navel oranges based on hyperspectral diffuse transmittance imaging. The present paper proposes several indexes based on specific wavelengths and combines them with pattern recognition methods for maturity identification. The proposed multispectral index reduces the calculation time in high-dimensional data processing and will be helpful for the development of online detection systems.

Fruit Material.
One hundred and fifty samples at different maturity stages (assessed in the field by manual labeling based on the period of growth from blossom) were harvested from local farms in Jiangxia, Wuhan, Hubei Province, China (30°32′N, 114°32′E). Navel oranges (Citrus sinensis Osbeck cv. Robertson) at different maturity stages were picked as follows: (I) immature, 180 days from blossom; (II) midmature, 200 days from blossom; (III) mature, 220 days from blossom [17]. All of the intact fruits were cleaned and numbered and then stored at 25 °C and 60% relative humidity for up to 24 h before processing. The weight and appearance of each fruit were measured prior to the acquisition of transmittance hyperspectral images.

Hyperspectral Images Acquisition.
For hyperspectral diffuse transmittance imaging, a laboratory-type spectrum measurement device was designed (Figure 1). The system mainly consisted of a high-performance back-illuminated CCD camera (Andor, Clara, DR-328G, Britain), an imaging spectrograph (SPECIM, V10E-CL, Finland) covering the spectral range of 390-1,055 nm, and an assembled light unit containing four 50 W quartz tungsten halogen lamps (Oriel Instruments, 6332, USA) with eight fans to dissipate heat. A mobile platform was used whose speed was controlled by a computer equipped with a spectral image system (Spectral SECN-V17E); the software was used to set and adjust the parameters of the device, including exposure time, motor speed, image acquisition, and wavelength range. The spectral resolution was 2.8 nm, and the resolution of the CCD camera was 672 × 512 (spatial × spectral) pixels. In this work, the moving speed of the mobile platform was 2.0 mm/s, and the exposure time of the CCD camera was 0.1 s. The whole system was assembled in a dark chamber. The acquired images were corrected with white and dark images using (1) as follows:
$$T = \frac{T_{\text{sample}} - T_{\text{dark}}}{T_{\text{white}} - T_{\text{dark}}} \quad (1)$$
where $T$ is the corrected hyperspectral image, $T_{\text{sample}}$ is the original uncorrected hyperspectral image, $T_{\text{white}}$ is the image of the white reference, which was obtained with a Teflon ball, and $T_{\text{dark}}$ is the image acquired by the system in the absence of lighting.
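The white and dark correction described above is commonly computed band by band as (sample − dark)/(white − dark). The following Python sketch assumes that standard form and uses synthetic intensity values; the function name and the handling of dead bands are illustrative choices, not details reported in this study.

```python
def correct_spectrum(sample, white, dark):
    """Apply (sample - dark) / (white - dark) band by band.

    All three inputs are equal-length sequences of raw intensities for one pixel.
    """
    if not (len(sample) == len(white) == len(dark)):
        raise ValueError("sample, white and dark must have the same number of bands")
    corrected = []
    for s, w, d in zip(sample, white, dark):
        span = w - d
        # Guard against a dead band where the white and dark readings coincide.
        corrected.append((s - d) / span if span != 0 else float("nan"))
    return corrected

# Synthetic raw counts for three bands (illustrative only).
raw = [600.0, 1100.0, 350.0]
white_ref = [2100.0, 2100.0, 1100.0]
dark_ref = [100.0, 100.0, 100.0]
relative = correct_spectrum(raw, white_ref, dark_ref)
```

Subtracting the dark image removes the sensor offset, and dividing by the white span normalizes for the uneven lamp output across bands, so the result is a relative transmittance rather than a raw count.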
2.3. Data Analysis. Spectra were collected from the images with the ENVI software (Version 4.7, ITT Visual Information Solutions, Boulder, USA). A circular region of interest (ROI) was applied to collect the average spectra, and the spectra were preprocessed by Savitzky-Golay smoothing. The classification models were developed with linear discriminant analysis (LDA) and the k-nearest neighbor (k-NN) algorithm. LDA is a common supervised identification method used in statistics, pattern recognition, and machine learning to find a linear combination of features that characterizes or separates two or more classes of objects or events [18]. k-NN is a simple algorithm that stores all available cases and classifies new cases based on a similarity measure (e.g., distance functions) [19]. The calculations were conducted in Matlab 7.11.0 (R2010b) (MathWorks, Natick, USA). The classification results were evaluated by the correct classification rate (CCR).
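To make the k-NN and CCR steps concrete, here is a minimal Python sketch for a single scalar feature. The study itself used Matlab and does not report the value of k or the distance measure, so k = 3, the absolute-difference distance, and all data values below are assumptions chosen only for illustration.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify one scalar feature by majority vote among its k nearest neighbours."""
    # Sort training samples by absolute distance to the query value.
    ranked = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def correct_classification_rate(true_labels, predicted_labels):
    """CCR in percent: correctly classified samples / all samples * 100."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * hits / len(true_labels)

# Synthetic index values and labels (illustrative only, not measured data).
train_x = [0.90, 0.85, 0.80, 0.10, 0.12, 0.08, -0.30, -0.35, -0.28]
train_y = ["immature"] * 3 + ["midmature"] * 3 + ["mature"] * 3
test_x = [0.88, 0.11, -0.31, 0.40]
test_y = ["immature", "midmature", "mature", "immature"]

predicted = [knn_predict(train_x, train_y, x) for x in test_x]
ccr = correct_classification_rate(test_y, predicted)
```

In this toy data the last test sample lies closer to the midmature cluster than to its own class, so it is misclassified and the CCR drops to three correct out of four.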

Multispectral Indexes Establishment.
Typically, multispectral indexes consist of differences or ratios of reflectance or transmittance at selected wavelengths. For the discrimination of fruit ripeness, several studies have adopted chlorophyll and water bands, such as 680, 800, 900, and 950 nm, which were considered the most closely related to maturity [20]. Qin and Lu [10] found large differences in the absorption spectra of tomatoes at three ripeness stages, and their ripeness was correctly identified using the ratio of the absorption coefficient at 675 nm (for chlorophyll content) to that at 535 nm (for anthocyanins). Lleó et al. [13] compared multispectral indexes extracted from hyperspectral images to assess the maturity of fruit and proposed four indexes. Among the many applications of spectral data, Ye et al. [21] used the two-band vegetation index (TBVI) to identify the best two-band-based predictor of citrus yield. TBVI can be calculated as (Rλ1 − Rλ2)/(Rλ1 + Rλ2). The TBVI based on the 823 nm (NIR) and 728 nm (red edge) wavelengths was found to provide optimal citrus yield information (R² = 0.5795, RRMSE = 0.6636). In accordance with these references, multispectral indexes were established in this study by analyzing the characteristics of the spectra.
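A two-band index of the normalized-difference type discussed above can be sketched in a few lines of Python. The function name and the band values are hypothetical and serve only to show the arithmetic.

```python
def two_band_index(value_1, value_2):
    """Normalized difference (v1 - v2) / (v1 + v2) of two band values."""
    total = value_1 + value_2
    if total == 0:
        raise ZeroDivisionError("band values sum to zero; the index is undefined")
    return (value_1 - value_2) / total

# Synthetic reflectance values at an NIR band and a red-edge band (illustrative only).
nir = 0.75
red_edge = 0.25
tbvi = two_band_index(nir, red_edge)
```

Because the difference is divided by the sum, the index is insensitive to a common scaling of both bands, which is one reason such indexes are popular when illumination intensity varies between samples.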

Results and Discussion
3.1. Spectra and Multispectral Indexes Considerations. The average spectra of oranges at three maturity stages are shown in Figure 2. As noise and irrelevant information existed at the beginning and the end of the spectra, only the range from 550 to 1,000 nm was used for analysis. In Figure 2, the main differences appear in the wavebands of 550-700 nm and 750-780 nm. The main difference is the chlorophyll absorption valley (around 700 nm), which disappears as the fruit ripens while the peak position shifts. A feature at 760 nm was also found, which could be due to the third overtone of O-H and the fourth overtone of C-H; the O-H and C-H functional groups are related to the concentration of internal components that absorb in these bands, such as soluble solids content [22]. Therefore, 640 nm and 760 nm (red edges) and 670 nm (for chlorophyll content) were considered the characteristic wavelengths in this study. With reference to Section 2.4, five multispectral indexes (I1 to I5) were established from these wavelengths; I5 is the index (T670 + T760 − T640)/(T670 + T760 + T640).
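The three-band index (T670 + T760 − T640)/(T670 + T760 + T640) can be computed from an averaged spectrum by picking the sampled bands nearest to the three target wavelengths. The Python sketch below assumes nearest-band selection, which the paper does not specify, and uses a synthetic six-point spectrum.

```python
def nearest_band(wavelengths, target_nm):
    """Index of the sampled wavelength closest to target_nm."""
    return min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))

def three_band_index(wavelengths, transmittance):
    """(T670 + T760 - T640) / (T670 + T760 + T640) for one averaged spectrum."""
    t640 = transmittance[nearest_band(wavelengths, 640)]
    t670 = transmittance[nearest_band(wavelengths, 670)]
    t760 = transmittance[nearest_band(wavelengths, 760)]
    return (t670 + t760 - t640) / (t670 + t760 + t640)

# Synthetic spectrum sampled every 2.8 nm around each target (illustrative only).
wavelengths = [637.0, 639.8, 668.2, 671.0, 758.4, 761.2]
transmittance = [0.10, 0.125, 0.20, 0.25, 0.50, 0.625]
index_value = three_band_index(wavelengths, transmittance)
```

Only three band values enter the calculation, which is what makes an index of this kind attractive for a multispectral camera with a small number of filters.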

Sample Measurements.
Table 1 shows some parameters of the samples in the calibration and prediction sets. Correlation analysis showed that these parameters of the 150 samples had no obvious correlation with maturity. The mass and size of the samples at different maturity stages are closely correlated. Minimizing the differences in mass and size of the fruits reduces the effect of individual fruits on the acquisition of the spectral images. The variation of SSC among oranges of different maturities was significant (p < 0.05), and the SSC increased from immature to mature fruits.

Analysis of Multispectral Indexes.
The analysis of the multispectral indexes is shown in Table 2. ANOVA showed that the indexes vary significantly (p < 0.05), so all the proposed multispectral indexes are statistically significant. Figure 3 shows the distribution of the five indexes for the calibration set. In the figure, the indexes I1, I2, I3, and I4 show no clear separation among the immature, midmature, and mature groups. For these four indexes, the mature group is the most widely distributed and the immature group is the most concentrated. The index I5, however, had good classification ability, especially for the immature group. The value of I5 for the immature group ranges from 1.16 to 40.7, whereas the value is less than 0.16 for the midmature and mature groups. With I5, the immature oranges can be clearly distinguished from the midmature and mature oranges. From Figure 3, the CCR of the immature samples could be 100%. For the other indexes, it is hard to discriminate the navel oranges at different ripening stages directly from the figure.
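For readers unfamiliar with the ANOVA step, the following Python sketch computes the one-way ANOVA F statistic for one index across three groups. The data are synthetic, and the sketch stops at the F value; obtaining the p-value would additionally require the F distribution with (k − 1, N − k) degrees of freedom, which is not implemented here.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    # Between-group sum of squares: group size times squared offset of each group mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Synthetic index values for three maturity groups (illustrative only).
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(groups)
```

A large F value means the group means differ by more than the spread within each group would explain. A significant F does not by itself guarantee that individual samples can be separated, which is why the distributions in Figure 3 are still needed.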
3.4. Classification Models. Different indexes were used as the inputs of the different models; the results of each model are shown in Table 3. From the distribution of the different indexes, the predictive performance of index I5 was expected to be the best. The classification results confirmed that the best identification was obtained by I5 using the k-NN algorithm, with a CCR of 100% and 96.0% for the calibration and prediction sets, respectively. In the I5-based LDA method, the CCR of the calibration and prediction sets was 83% and 78%, respectively. The main error appeared in the discrimination of the immature group, although immature samples can be distinguished from oranges at the other maturity stages by the value of I5 without modeling. Therefore, the immature samples were first separated out by the index value, and LDA was then applied to classify the midmature and mature samples. This improved the classification of the midmature and mature groups. The main misclassified samples are shown in Figure 4. The CCR for the discrimination of the two groups is 88.2% (four midmature samples were misidentified). If all the immature samples are counted as correctly classified, the overall predictive CCR improves to 92.0%, which is still lower than the predictive CCR of k-NN. Compared with other studies on the nondestructive maturity detection of oranges or citrus, the CCR of 96% is better than the 91.67% identification accuracy based on machine vision reported by Ying et al. [23]. It is also better than the 82% inspection accuracy based on multifractal spectra obtained by Cao [24].
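The percentages reported for the two-stage LDA scheme can be checked with simple arithmetic. The sample counts in the Python sketch below (a prediction set of 50 oranges, 34 of them midmature or mature) are inferred from the reported percentages and are not stated explicitly in the text, so they should be read as assumptions.

```python
# Counts below are inferred from the reported percentages, not stated in the text.
n_prediction = 50          # assumed size of the prediction set
n_mid_and_mature = 34      # assumed midmature + mature samples in that set
n_immature = n_prediction - n_mid_and_mature
n_misclassified = 4        # midmature samples misidentified by LDA (from the text)

def percent(correct, total):
    """Correct classification rate in percent, rounded to one decimal."""
    return round(100.0 * correct / total, 1)

# Stage 2 only: LDA on the midmature and mature samples.
two_group_ccr = percent(n_mid_and_mature - n_misclassified, n_mid_and_mature)
# Both stages: all immature samples counted as correct, plus the stage 2 result.
overall_ccr = percent(n_prediction - n_misclassified, n_prediction)
# For comparison, 96.0% would correspond to 48 correct out of 50.
knn_ccr = percent(48, n_prediction)
```

Under these assumed counts the three figures come out as 88.2%, 92.0%, and 96.0%, matching the values reported above.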

Conclusion
In this study, multispectral indexes were established for the maturity classification of navel oranges based on diffuse transmittance hyperspectral imaging. The index calculated as (T670 + T760 − T640)/(T670 + T760 + T640) performed well for maturity detection, with a CCR of 96.0% by the k-NN method. In particular, this index has an excellent capability of distinguishing immature oranges. In practice, it is well suited to the development of a portable instrument that can quickly identify immature oranges with high precision. As only three wavelengths are needed, it will be beneficial for the development of low-cost and real-time multispectral imaging systems for industrial applications.
In in-field applications, however, the angle and layout of the light source could have a great effect on the quality of the obtained hyperspectral images. Further work will therefore focus on the development of a light source device for in-field application.

Figure 1: Schematic diagram of the used diffuse transmittance hyperspectral imaging system for navel oranges maturity classification.

Figure 2: Average spectra of navel oranges at three different maturity stages.

Figure 3: Multispectral indexes distribution of navel oranges at three different maturity stages.

Table 1: Some physical parameters of navel oranges at three different maturity stages.

Table 2: The statistical analysis of the five indexes.

Table 3: The classification results of the different models for the prediction set.