Rapid Characterization of Tanshinone Extract Powder by Near Infrared Spectroscopy

Chemical and physical quality attributes of herbal extract powders play an important role in the research and development of Chinese medicine preparations. The active pharmaceutical ingredients have a direct impact on the herbal extract's efficacy, while the physical properties of the raw material affect the pharmaceutical manufacturing process and the quality of the final products. In this study, tanshinone extract powders from Salvia miltiorrhiza, which are widely used in the clinic for the treatment of cardiovascular diseases, are taken as the research object. Both the chemical and physical information of tanshinone extract powders is analyzed by near infrared (NIR) spectroscopy. Partial least squares (PLS) and least squares support vector machine (LS-SVM) models are investigated to build the relationship between NIR spectra and reference values. PLS models performed well for the contents of cryptotanshinone and tanshinone IIA, the moisture content, and the average median particle size, whereas for specific surface area and tapped density the LS-SVM models performed better than the PLS models. The results demonstrate that NIR is a valid and fast process analytical technology tool for simultaneously determining multiple quality attributes of herbal extract powders and indicate that some nonlinear relationship exists between NIR spectra and physical quality attributes.


Introduction
Pharmaceutical powders, described as heterogeneous systems with different chemical and physical attributes, are the main source of oral solid preparations. It is estimated that more than 80% of drug production in the typical pharmaceutical industry is based on powders [1]. As the input of the pharmaceutical process, powders have a great impact on the whole production process. Small quality fluctuations of powders may result in batch-to-batch variations of the final products [2][3][4]. For example, the performance of the granule tableting process deteriorated significantly when the initial moisture content of microcrystalline cellulose was increased from 2.6% to 4.9%, values that are considered to be within the normal variation of moisture content (i.e., 3-5%) for microcrystalline cellulose. On the other hand, the flowability of the granule improved as the initial moisture content of microcrystalline cellulose was increased [4]. Therefore, it is necessary to understand and control the critical material attributes of powders at the beginning of pharmaceutical processes.
Process analytical technology, launched by the United States Food and Drug Administration [5], is often used to monitor and control critical quality attributes of raw materials and in-process products to ensure the quality of final products. Process analytical technology approaches, which are based on scientific knowledge and risk analysis, support the design and development of efficiently controlled processes. In this way, it is possible to realize the preset target of the product when the manufacturing process is finished. Common process analytical technology tools used in the rapid evaluation of chemical and physical properties of powders include near infrared spectroscopy [6,7], Raman spectroscopy [8], Raman chemical imaging [9], and acoustic emission [10,11]. Among them, near infrared spectroscopy (NIR) is the most widely used process analytical technology tool in pharmaceutical process monitoring and control [12,13], since it is fast, nondestructive, and low-cost. Compared with imaging technology, NIR is rapid and low-cost. For the analysis of complex systems with many ingredients, such as herbal materials [14,15], NIR shows unique advantages over other spectroscopy technologies, such as Raman spectroscopy, which does well in the analysis of pure compounds [16].

International Journal of Analytical Chemistry
NIR spectra carry abundant information not only on the chemical composition but also on the physical properties (e.g., particle size) of the sample [17,18]. In powder analysis, some qualitative work with NIR has been reported, such as rapid identification of the production area [19] and brand traceability [20]. Quantitative work has also been done in relating NIR spectra to different quality attributes of powders, such as content of ingredients [6], flowability [21], moisture content [22], and particle size [23]. Subsequent unit operations of solid dosage preparations could benefit from such quality control of powders.
Herbal extract powders as raw materials play an important role in the research, development, and manufacturing of Chinese medicine preparations. Currently, the quality control of herbal extract powders mainly focuses on the content of active pharmaceutical ingredients according to the Chinese Pharmacopoeia (Ch.P.) 2010 [24]. However, as with powders of chemical materials, controlling the quality of herbal extract powders only by the content of components is far from enough. In order to understand and control the quality of herbal extract powders as well as the related manufacturing processes and products, more attention should be paid to the physical properties of herbal extract powders. Generally, classical methods for determining the physical properties of herbal extract powders are time-consuming. Recently, NIR spectroscopy has been reported to be successfully applied in the analysis of herbal powders, where the contents of active pharmaceutical ingredients were the main focus. Quantification of the contents of two or more ingredients in herbal extract powders could be carried out by NIR [15,25]. Nevertheless, the application of NIR to the analysis of physical properties of herbal extract powders is still an untouched area. Therefore, the aim of this paper is to investigate the possibility of using NIR to predict both the chemical and physical properties of herbal extract powders at the same time. To the best of our knowledge, this is the first report on the application of NIR spectroscopy to the characterization of multiple quality attributes of herbal extract powders. It is expected that the usage of NIR could be broadened in the quality control of raw materials of herbal products.
The rest of the paper is organized as follows: firstly, the contents of cryptotanshinone and tanshinone IIA in 50 batches of tanshinone extract powders were determined by high performance liquid chromatography (HPLC), and the physical quality attributes were measured by classical methods. Then, the NIR spectra of tanshinone extract powders were collected in diffuse reflection mode and different data pretreatment methods were screened. After that, partial least squares (PLS) and least squares support vector machine (LS-SVM) models for quantitative prediction of different quality attributes were built, and the performances of these models were compared. Finally, a conclusion of this paper is provided.

... Table 1, and the experiment schedules are listed in Table 2. The alpha value was set to 1 and the center point was replicated 5 times. The extraction process was carried out according to procedures specified in the Chinese Pharmacopoeia (Ch.P.) 2010 [24] as follows: pulverized powders of Salvia miltiorrhiza were extracted by alcohol through heating reflux, after which the alcohol extract was filtered and the filtrates were merged. Then, the filtrate was vacuum evaporated to recover ethanol, concentrated to a relative density of 1.30-1.35 at 60 °C, washed to colorless with hot water, dried at 80 °C, and finally crushed to fine powders. As a result, 50 batches of alcohol extracts of Salvia miltiorrhiza were used in the following experiments.

Method
3.1. HPLC Analysis. 5 grams of tanshinone extract powder was precisely weighed into a 5 mL volumetric flask, dissolved in methanol, and then diluted to volume with methanol. All samples were filtered through a millipore membrane filter with an average pore diameter of 0.45 μm, and 10 μL of filtrate was injected into the HPLC system for analysis.
The contents of cryptotanshinone and tanshinone IIA were quantitated by reverse phase HPLC according to Ch.P. 2010 [24]. An Agilent 1100 HPLC system (Agilent Technologies, Santa Clara, California, USA) with a vacuum degasser, a quaternary pump, an autosampler, a thermostatic column compartment, and a diode array detector was used. Separation was performed on an Agilent SB C18 column (250 mm × 4.6 mm with 5 μm particle size) at 30 °C. The mobile phase consisted of (A) acetonitrile and (B) 0.026% phosphoric acid water solution. The gradient elution was as follows: linear change from (A) 0 to 60% at 0-20 min and linear change from (A) 60% to 80% at 20-50 min. The signal was monitored at 270 nm. The flow rate was maintained at 0.8 mL⋅min⁻¹. Reequilibration duration was 10 min between individual runs.
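The two-segment linear gradient described above can be written as a small piecewise-linear function. This is only an illustrative sketch: the function name is hypothetical, and the breakpoints are simply the values quoted in the text.

```python
def percent_acetonitrile(t_min: float) -> float:
    """Return the percentage of mobile phase A (acetonitrile) at time t_min (minutes)."""
    # Breakpoints (time in min, %A) taken from the gradient quoted in the text.
    points = [(0.0, 0.0), (20.0, 60.0), (50.0, 80.0)]
    if not points[0][0] <= t_min <= points[-1][0]:
        raise ValueError("time outside the 0-50 min gradient window")
    # Linear interpolation inside the segment that contains t_min.
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if t0 <= t_min <= t1:
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable")


for t in (0, 10, 20, 35, 50):
    print(t, percent_acetonitrile(t))
```

The percentage of mobile phase B at any time is then 100 minus the returned value.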

NIR Spectroscopy.
The NIR spectra were collected in the integrating sphere mode using an Antaris Nicolet FT-NIR system (Thermo Fisher Scientific Inc., Waltham, Massachusetts, USA). About 2 grams of powder was used with compaction in each test. Each sample spectrum was the result of 64 scans in the range between 10,000 and 4000 cm⁻¹ using 8 cm⁻¹ resolution at ambient temperature and was recorded as log(1/R) with air as reference. Every sample was scanned three times and the final spectrum was the average of the three. All NIR spectra were collected and archived using the Thermo Scientific Result software.
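As a minimal sketch of the last two steps, assuming the instrument delivers reflectance values R between 0 and 1, the conversion to log(1/R) and the averaging of three replicate scans could look as follows. The function names and the three-point spectra are hypothetical and serve only as an illustration.

```python
import math


def absorbance(reflectance):
    """Convert reflectance values R (0 < R <= 1) to log10(1/R)."""
    return [math.log10(1.0 / r) for r in reflectance]


def average_spectra(scans):
    """Point-by-point mean of several spectra of equal length."""
    n = len(scans)
    return [sum(values) / n for values in zip(*scans)]


# Three hypothetical replicate scans of one sample (reflectance at 3 wavenumbers).
scans_r = [[0.50, 0.10, 0.25], [0.50, 0.10, 0.25], [0.50, 0.10, 0.25]]
mean_abs = average_spectra([absorbance(s) for s in scans_r])
print(mean_abs)
```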

Physical Attributes Determination.
The specific surface area was determined by the 3h-2000a automatic specific surface area analyzer (Beishide Instrument Technology (Beijing) Co., Ltd., Beijing, China) according to the multimolecular layer adsorption theory. In reference mode, with nitrogen as the adsorbate and purge medium and helium as carrier gas, the sample was purged for 60 minutes at 30 °C.
The particle size distribution was determined by the bt-2001 laser particle size analyzer (Dandong Bettersize Instruments Ltd., Dandong, Liaoning, China). Based on the light scattering theory, measurements were obtained using dry dispersion with air as the medium and a sample refractive index of 1.520. The D10, D50, and D90 values were calculated to represent the maximal particle diameters that include 10%, 50%, and 90% of the particles, respectively. For example, the D90 value means that 90% of the particles are smaller than this particle diameter, whereas the remaining 10% of the particles have larger diameters. Each sample was tested three times, and the average value was taken.
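To make the definition of D10, D50, and D90 concrete, the sketch below reads them off a cumulative undersize distribution by linear interpolation. The interpolation scheme, the function name, and the example distribution are assumptions for illustration; the analyzer software may use a different scheme (for example, interpolation on a logarithmic size axis).

```python
def d_value(sizes_um, cumulative_pct, target_pct):
    """Diameter below which target_pct percent of the particles fall.

    sizes_um: ascending particle diameters (micrometres)
    cumulative_pct: cumulative undersize percentage at each diameter
    """
    pairs = list(zip(sizes_um, cumulative_pct))
    for (s0, c0), (s1, c1) in zip(pairs, pairs[1:]):
        if c0 <= target_pct <= c1 and c1 > c0:
            # Linear interpolation between the two bracketing points.
            return s0 + (s1 - s0) * (target_pct - c0) / (c1 - c0)
    raise ValueError("target percentage outside the measured range")


# Hypothetical cumulative distribution, not data from this study.
sizes = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
cum = [0.0, 10.0, 30.0, 50.0, 90.0, 100.0]
d10, d50, d90 = (d_value(sizes, cum, p) for p in (10, 50, 90))
print(d10, d50, d90)
```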
The tapped density was analyzed by the hy-100 powder density tester (Dandong Hengyu Instruments Ltd., Dandong, Liaoning, China). According to the European Pharmacopoeia 8.0 [26], 5 grams of powder was poured into the measuring cylinder. The volume was then measured after 1250 taps, which has been described as a number of taps sufficient to achieve maximum compaction equilibrium [27]. The final volume (Vt) was used to compute the tapped density. Each sample was measured in triplicate, and the average density was taken.
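The tapped density calculation is simply the sample mass divided by the final tapped volume, averaged over the triplicate measurements. A minimal sketch, with hypothetical tapped volumes (1 mL equals 1 cm³):

```python
from statistics import mean


def tapped_density(mass_g, tapped_volume_ml):
    """Tapped density in g per cm^3: mass divided by the final tapped volume Vt."""
    return mass_g / tapped_volume_ml


# Hypothetical triplicate tapped volumes (mL) for a 5 g sample.
volumes = [6.7, 6.8, 6.8]
densities = [tapped_density(5.0, v) for v in volumes]
mean_density = mean(densities)
print(round(mean_density, 2))
```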
The moisture content of the samples was determined by the Sartorius ma-35 moisture analyzer (Sartorius AG, Göttingen, Germany). For this test, about 2 grams of powder was heated for 10 minutes at 105 °C.

NIR Spectra Pretreatment.
A variety of preprocessing methods for the spectroscopic data were compared to separate the useful information from noise, including normalization, baseline correction, Savitzky-Golay smoothing, Savitzky-Golay smoothing plus first-order derivatives, Savitzky-Golay smoothing plus second-order derivatives, spectroscopic transformation, multiplicative scatter correction, standard normal variate transformation, and wavelet denoising of spectra. SIMCA-P+ 11.5 (Umetrics AB, Umeå, Sweden) and Unscrambler 9.7 (Camo software, Oslo, Norway) served as chemometric tools for data preprocessing.
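Two of the listed pretreatments, standard normal variate transformation and multiplicative scatter correction, are easy to state in code. The sketch below uses NumPy and is not the implementation in SIMCA-P+ or Unscrambler; the function names and the synthetic spectra are assumptions for illustration, and the mean spectrum is assumed as the scatter-correction reference.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) by its own mean and std."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: the mean spectrum)."""
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # Least-squares fit: row ~ intercept + slope * ref
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected


# Synthetic spectra: one base shape with different offsets and scale factors.
base = np.array([0.2, 0.5, 0.9, 0.4, 0.3])
raw = np.vstack([1.0 * base + 0.0, 1.5 * base + 0.1, 0.8 * base - 0.05])
snv_out = snv(raw)
msc_out = msc(raw)
```

Because the synthetic spectra differ only by an offset and a scale factor, both corrections map them onto a single common curve, which is the scatter-removal effect these methods aim for.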

Model Building.
In order to build quantitative models, the samples were split into calibration and validation sets by the Kennard-Stone algorithm. In this study, 40 samples were selected as the calibration set, while the remaining 10 samples were kept as the validation set. The whole spectra in the wavenumber range 10,000-4000 cm⁻¹ were used to build the models.
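A minimal sketch of a Kennard-Stone split is given below, assuming Euclidean distances computed on the spectra themselves (some implementations work on principal component scores instead). The function name and the random stand-in data are hypothetical.

```python
import numpy as np


def kennard_stone(x, n_select):
    """Return the indices of n_select rows of x chosen by the Kennard-Stone algorithm."""
    x = np.asarray(x, dtype=float)
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    # Start with the two samples that are farthest apart.
    first, second = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(first), int(second)]
    remaining = [i for i in range(len(x)) if i not in selected]
    while len(selected) < n_select:
        # Distance from each candidate to its nearest already-selected sample.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        # Pick the candidate whose nearest selected neighbour is farthest away.
        chosen = remaining[int(np.argmax(nearest))]
        selected.append(chosen)
        remaining.remove(chosen)
    return selected


rng = np.random.default_rng(0)
spectra = rng.normal(size=(50, 20))  # hypothetical stand-in for 50 spectra
cal_idx = kennard_stone(spectra, 40)
val_idx = sorted(set(range(50)) - set(cal_idx))
```

The selected indices form the calibration set, and the samples left over form the validation set, mirroring the 40/10 split used here.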
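The PLS regression algorithm, performed in Matlab version 7.0 (Mathworks Inc., Natick, Massachusetts, USA) with PLS Toolbox 2.1 (Eigenvector Research Inc., Wenatchee, Washington, USA), was used to set up quantitative models. The number of latent variables was optimized by the leave-one-out cross validation method and the predicted residual error sum of squares (PRESS). The performances of the PLS models were evaluated in terms of the correlation coefficient for both calibration and validation sets ($R_{cal}$ and $R_{pre}$, resp.), the root mean square error of calibration (RMSEC), the root mean square error of cross validation (RMSECV), the root mean square error of prediction (RMSEP), BIAS for both calibration and validation (BIAS$_{cal}$ and BIAS$_{pre}$, resp.), and the relative predictive deviation (RPD). A PLS model shows good performance when the $R$ and RPD values are large and the RMSEC, RMSECV, RMSEP, and BIAS values are small. The equations of these indicators were as follows:

$$R = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2 \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}$$

$$\mathrm{RMSEC},\ \mathrm{RMSECV},\ \mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}}$$

$$\mathrm{BIAS} = \frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)}{n}$$

$$\mathrm{RPD} = \frac{\mathrm{SD}_{pre}}{\mathrm{RMSEP}}, \qquad \mathrm{SD}_{pre} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}}$$

where $n$ is the number of samples, $y_i$ is the reference value of sample $i$, $\hat{y}_i$ is the predictive value of sample $i$, $\bar{y}$ is the average of the reference values, and $\bar{\hat{y}}$ is the average of the predictive values. The three root mean square errors share the same form and are evaluated on the calibration, cross validation, and prediction results, respectively. SD$_{pre}$ is the standard deviation of the prediction set data, for which $y_i$ is the reference value of the prediction set and $\bar{y}$ is the average value of the prediction set.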
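The selection of the number of latent variables by leave-one-out cross validation and PRESS can be sketched with a small single-response PLS (NIPALS) routine. This is an illustrative NumPy sketch on synthetic data, not the PLS Toolbox implementation; all function names and the data are assumptions.

```python
import numpy as np


def pls1_fit(x, y, n_lv):
    """Fit a single-response PLS model (NIPALS); return (coefficients, x_mean, y_mean)."""
    x_mean, y_mean = x.mean(axis=0), y.mean()
    xr, yr = x - x_mean, y - y_mean
    weights, loadings, inner = [], [], []
    for _ in range(n_lv):
        w = xr.T @ yr
        w /= np.linalg.norm(w)      # weight vector
        t = xr @ w                  # scores
        tt = t @ t
        p = xr.T @ t / tt           # x loadings
        q = yr @ t / tt             # y loading
        xr = xr - np.outer(t, p)    # deflate x
        yr = yr - q * t             # deflate y
        weights.append(w)
        loadings.append(p)
        inner.append(q)
    w_mat, p_mat = np.array(weights).T, np.array(loadings).T
    coef = w_mat @ np.linalg.solve(p_mat.T @ w_mat, np.array(inner))
    return coef, x_mean, y_mean


def pls1_predict(model, x):
    coef, x_mean, y_mean = model
    return (x - x_mean) @ coef + y_mean


def loo_press(x, y, max_lv):
    """PRESS for 1..max_lv latent variables by leave-one-out cross validation."""
    n = len(y)
    press = np.zeros(max_lv)
    for i in range(n):
        keep = np.arange(n) != i
        for k in range(1, max_lv + 1):
            model = pls1_fit(x[keep], y[keep], k)
            press[k - 1] += (y[i] - pls1_predict(model, x[i:i + 1])[0]) ** 2
    return press


# Synthetic stand-in for 40 calibration spectra and one reference attribute.
rng = np.random.default_rng(1)
x = rng.normal(size=(40, 30))
y = x[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=40)
press = loo_press(x, y, max_lv=8)
rmsecv = np.sqrt(press / len(y))
best_lv = int(np.argmin(press)) + 1
print(best_lv, rmsecv.round(3))
```

Choosing the number of latent variables at the PRESS minimum is one common rule; in practice a smaller model is sometimes preferred when the PRESS curve is nearly flat.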
The LS-SVM algorithm, carried out with the LS-SVMlab Toolbox 1.8 (Department of Electrical Engineering, Leuven-Heverlee, Belgium) [28], was also used to set up quantitative models. To obtain the LS-SVM model, two extra hyperparameters, gam and sig2, need to be tuned by leave-one-out cross validation. Gam is the regularization parameter, determining the tradeoff between the training error minimization and the smoothness of the estimated function. Sig2 is the Gaussian RBF kernel function parameter. The performance of the LS-SVM model was evaluated with the same chemometric indicators as the PLS regression.

... Table 3, it is obvious that the contents of cryptotanshinone and tanshinone IIA varied considerably among different samples. Large quality fluctuations could also be observed among the commercial tanshinone extracts produced under the same specifications according to the Ch.P. 2010 [24]. For example, the contents of cryptotanshinone in commercial extract powders varied from 2.8 to 88 mg⋅g⁻¹, with a mean value of 12 mg⋅g⁻¹ and a standard deviation of 28 mg⋅g⁻¹. For tanshinone IIA in commercial extract powders, the contents ranged from 0.85 to 1.4 × 10² mg⋅g⁻¹, with the average value and standard deviation being 25 and 27 mg⋅g⁻¹, respectively. Possible reasons include different sources of Salvia miltiorrhiza, different preparation processes, and various storage conditions.

Analysis of Physical Attributes.
The results of the physical attribute tests are shown in Table 4. The relative standard deviations of all physical attributes were smaller than 0.5. It is clear that the variation coverage of the physical properties was smaller than that of the contents of active pharmaceutical ingredients, whose relative standard deviations were above 1.0. Generally, the specific surface area of a loose porous material is expected to be large because of its many micropores. However, the specific surface area values for the homemade and commercial Salvia miltiorrhiza extract powders were all below 0.500 m²⋅g⁻¹, indicating that these samples were dense with few micropores. If such extract powder were used as raw material for dry granulation or direct tableting, the dense structure might negatively affect the dissolution of the produced granules or tablets.
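As a worked check of the relative standard deviation (standard deviation $s$ divided by mean $\bar{x}$), the values quoted earlier for the commercial extract powders (mean 12 and standard deviation 28 mg⋅g⁻¹ for cryptotanshinone; mean 25 and standard deviation 27 mg⋅g⁻¹ for tanshinone IIA) give:

$$\mathrm{RSD} = \frac{s}{\bar{x}}, \qquad \mathrm{RSD}_{\text{cryptotanshinone}} = \frac{28}{12} \approx 2.3, \qquad \mathrm{RSD}_{\text{tanshinone IIA}} = \frac{27}{25} \approx 1.1$$

Both values lie above 1.0, in line with the comparison made above against the physical attributes, whose relative standard deviations stay below 0.5.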
Homemade extract powders were prepared by grinding, while the commercial extract powders came directly from spray drying. The two different preparation methods may lead to the difference in particle size distribution between the two sample sets. D50 values were 15.34-57.17 μm for commercial samples and 35.52-83.33 μm for homemade samples, suggesting that spray-dried powders were finer than ground ones. In real pharmaceutical applications, such as granulation or tableting, fine powders made by spray drying are a better choice of raw material than coarse powders, since fine powders offer better uniformity of distribution.
The tapped density values of homemade samples are close to those of commercial samples. Unlike liquid density, solid density is not a unique "band." The tapped density of herbal extract powders with different chemical compositions and contents may be the same. For example, as seen in Table 5, the contents of cryptotanshinone for the first three samples are 3.4, 37, and 33 mg⋅g⁻¹, and the contents of tanshinone IIA are 1.5, 2.5, and 28 mg⋅g⁻¹, respectively. The three samples had the same tapped density of 0.74 g⋅cm⁻³, although only the former two samples had similar contents of tanshinone IIA and only the latter two samples had similar contents of cryptotanshinone. In contrast, the tapped density of extract powders with similar chemical compositions may be different. The contents of cryptotanshinone of the last two samples in Table 5 are 4.0 and 4.3 mg⋅g⁻¹, and their contents of tanshinone IIA are 2.0 and 2.9 mg⋅g⁻¹, indicating that the chemical compositions and contents of these two samples are similar, but the tapped densities are 0.77 and 0.70 g⋅cm⁻³, respectively. Similar to the tapped density values, the moisture content values did not show much difference. Most moisture contents were below 5%, except for two samples of the homemade sample set. The moisture of extract powders could directly affect subsequent operations, such as dry granulation and direct compaction. If the moisture content is too high, the storage of extract powders will be a challenge. Conversely, if the moisture content is too low, direct compaction of tablets will be difficult. The moisture of tanshinone extract powder should therefore be monitored and controlled within a proper range.
In summary, the values of each quality attribute fluctuated within a certain range. The variation in tapped density (relative standard deviation of 0.083 for homemade and 0.096 for commercial samples) was smaller than that of the other indexes (relative standard deviations all above 0.15 for both homemade and commercial samples). Tapped density is macroscopic, while the other quality attributes are microscopic, which may lead to the different variation coverage of the measured indexes.

Data Pretreatment.
Figure 1 shows the raw NIR spectra without any pretreatment. In the wavenumber region 7000-4000 cm⁻¹, serious peak overlapping and considerable noise could be observed, suggesting that a great deal of information may be concealed. For different quality attributes, the NIR spectra with different data pretreatment methods were found to differ in capability in both calibration and validation. As shown in Tables 6 and 7, the best preprocessing methods for predicting the contents of cryptotanshinone and tanshinone IIA are normalization and Savitzky-Golay smoothing plus first-order derivatives, respectively. Normalization could eliminate redundant information and increase the difference among samples. Savitzky-Golay smoothing could clear high frequency noise by means of least squares polynomial fitting to the data in the moving window, and the first-derivative spectrum could eliminate shift irrelevant to the wavelength. As shown in Table 8, the best preprocessing method for specific surface area, D10, D50, and moisture content is Savitzky-Golay smoothing plus first-order derivatives. Figure 2 shows the NIR spectra after Savitzky-Golay smoothing plus first-order derivatives, where the shifted baselines of the raw spectra are corrected. For D90 and tapped density, the best preprocessing methods are spectroscopic transformation and wavelet denoising, respectively. Spectroscopic transformation is often used to switch between absorbance and reflectance data and to transform reflectance data into Kubelka-Munk units. Wavelet denoising deals with the high frequency noise of the spectrum.

Table notes: Raw means using the original spectra without any pretreatment; LVs means the number of latent factors of the PLS model. R cal and R pre represent the correlation coefficients for the calibration and validation sets, respectively. RMSEC, RMSECV, and RMSEP represent the root mean square error of calibration, cross validation, and prediction, respectively. BIAS cal and BIAS pre represent the bias for calibration and validation, respectively. RPD means relative predictive deviation. S-G smooth means Savitzky-Golay smoothing; S-T represents spectroscopic transformation; MSC means multiplicative scatter correction; S-G 1st is short for Savitzky-Golay smoothing plus first-order derivatives; S-G 2nd means Savitzky-Golay smoothing plus second-order derivatives; baseline means baseline correction; SNV represents standard normal variate transformation; and WDS is short for wavelet denoising of spectra.
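The spectroscopic transformation mentioned above, switching from absorbance to reflectance and then to Kubelka-Munk units, can be sketched as follows. It is assumed here that absorbance is stored as log(1/R) and that the Kubelka-Munk function takes its usual form F(R) = (1 − R)² / (2R); the exact options offered by the chemometric software may differ, and the function names are hypothetical.

```python
def absorbance_to_reflectance(a):
    """Invert A = log10(1/R) to recover the reflectance R."""
    return 10.0 ** (-a)


def kubelka_munk(r):
    """Kubelka-Munk function F(R) = (1 - R)^2 / (2R) for reflectance 0 < R <= 1."""
    return (1.0 - r) ** 2 / (2.0 * r)


# Hypothetical absorbance values at three wavenumbers.
absorbances = [0.0, 0.30103, 1.0]
km = [kubelka_munk(absorbance_to_reflectance(a)) for a in absorbances]
print(km)
```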

Calibration and Validation of Quantitative Models.
The calibration results for the content of cryptotanshinone (see Figure 3) demonstrate that 11 latent factors, giving the minimum RMSECV and PRESS values, are enough to build the PLS model. The correlation coefficients of the calibration and validation sets were 0.9963 and 0.9969, respectively. The root mean square errors of calibration, cross validation, and prediction were 0.0018, 0.0033, and 0.0013 mg⋅g⁻¹, respectively, and the RPD value was 8.9. Gam and sig2 of the LS-SVM model for the tapped density were optimized by the standard simplex algorithm, and the resulting values are 6.9355 and 111.63, respectively (see Figure 4). The correlation coefficients of the calibration and validation sets were 0.9851 and 0.8875, respectively. The root mean square error of prediction was 0.020 g⋅cm⁻³, and the RPD value was 2.2.
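The indicators reported above can be computed from reference and predicted values with a few lines of Python. The sketch assumes the common conventions: bias as the mean of predicted minus reference, root mean square error with denominator n, and RPD as the sample standard deviation of the reference values divided by the root mean square error. The function names and numbers are hypothetical, not data from this study.

```python
import math
from statistics import mean, stdev


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    return math.sqrt(mean((p - r) ** 2 for r, p in zip(reference, predicted)))


def bias(reference, predicted):
    """Mean of (predicted - reference); the sign convention is an assumption."""
    return mean(p - r for r, p in zip(reference, predicted))


def correlation(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    r_bar, p_bar = mean(reference), mean(predicted)
    num = sum((r - r_bar) * (p - p_bar) for r, p in zip(reference, predicted))
    den = math.sqrt(sum((r - r_bar) ** 2 for r in reference)
                    * sum((p - p_bar) ** 2 for p in predicted))
    return num / den


def rpd(reference, predicted):
    """Relative predictive deviation: SD of the reference values divided by the RMSE."""
    return stdev(reference) / rmse(reference, predicted)


# Hypothetical validation-set values.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.9, 5.1]
print(rmse(reference, predicted), bias(reference, predicted),
      correlation(reference, predicted), rpd(reference, predicted))
```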
Furthermore, the established PLS and LS-SVM models are compared, as shown in Tables 8 and 9. PLS models exhibited good performance in the prediction of chemical properties, particle size, and moisture content. For specific surface area and tapped density, however, LS-SVM models performed better than PLS models. Taking the tapped density as an example, the correlation coefficients of the calibration and validation sets for the PLS model were 0.8830 and 0.8940, respectively. The root mean square errors of calibration, cross validation, and prediction were 0.034, 0.038, and 0.023 g⋅cm⁻³, respectively. The RPD value was only 1.9. In contrast, the correlation coefficients of the calibration and validation sets for the LS-SVM model were 0.9851 and 0.8875, respectively. The root mean square error of prediction decreased to 0.020 g⋅cm⁻³, and the RPD value increased to 2.2. For all quality attributes, the performances of LS-SVM models were slightly better than those of PLS models. As stated in [29], LS-SVM models can take into account some nonlinearity between the dependent and independent variables and can thereby improve on PLS models with low prediction abilities. That is to say, there may be some nonlinear relationship between the NIR spectra and quality attributes. However, for the prediction of the contents of cryptotanshinone and tanshinone IIA, particle size, and moisture, PLS models were sufficient and are easy to implement. For physical attributes such as the specific surface area and tapped density, where the PLS models did not perform well, the LS-SVM model may be a better choice.
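To illustrate how an LS-SVM regression model with hyperparameters gam and sig2 captures a nonlinear relationship, the sketch below solves the standard LS-SVM linear system with NumPy. It is not the LS-SVMlab implementation: the kernel is written as exp(−‖x − z‖² / (2·sig2)), which is an assumed convention that may differ from the toolbox, and the function names and the sine-curve data are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, sig2):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sig2))


def lssvm_fit(x, y, gam, sig2):
    """Solve [[0, 1^T], [1, K + I/gam]] [b; alpha] = [0; y] for b and alpha."""
    n = len(y)
    a = np.zeros((n + 1, n + 1))
    a[0, 1:] = 1.0
    a[1:, 0] = 1.0
    a[1:, 1:] = rbf_kernel(x, x, sig2) + np.eye(n) / gam
    sol = np.linalg.solve(a, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias term b, support values alpha


def lssvm_predict(x_train, b, alpha, sig2, x_new):
    """Predict with f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(x_new, x_train, sig2) @ alpha + b


# Synthetic nonlinear example: a sine curve sampled at 40 points.
x = np.linspace(-3.0, 3.0, 40)[:, None]
y = np.sin(x[:, 0])
b, alpha = lssvm_fit(x, y, gam=100.0, sig2=1.0)
y_hat = lssvm_predict(x, b, alpha, 1.0, x)
train_rmse = float(np.sqrt(np.mean((y_hat - y) ** 2)))
print(train_rmse)
```

A larger gam puts more weight on fitting the training data, while sig2 controls the width of the kernel; in practice both would be tuned by cross validation, as described for the models in this study.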

Conclusions
In this paper, the chemical and physical quality attributes of tanshinone extract powders are determined simultaneously by near infrared spectroscopy for the first time. PLS and LS-SVM are used to build quantitative models. It is found that PLS models exhibit good performance in the prediction of the chemical properties, particle size (D10, D50, and D90), and moisture content, while the LS-SVM models are good at predicting the specific surface area and tapped density. The results demonstrate that the massive information concealed in NIR spectra can be analyzed with the help of a combination of process analytical technology tools and chemometric methods. The subsequent process operations, such as blending, granulation, and tableting, and even the final products could benefit from the understanding and control of herbal extract powders.