Low-Content Quantitation in Entecavir Tablets Using 1064 nm Raman Spectroscopy

The nondestructive and highly sensitive analysis of a low-content active pharmaceutical ingredient (API) is a difficult problem, especially in the complex matrix of a pharmaceutical formulation. In this paper, a rapid method requiring no sample preparation was developed, which used a 1064 nm Raman spectrometer to detect entecavir monohydrate (ETV-H) in Baraclude tablets. Entecavir is a drug approved by the FDA for the treatment of chronic hepatitis B and has become the first choice in the market. The wavelength selection results showed that the signal-to-background ratio of the Raman spectrum with 1064 nm excitation was 14 times that of the commonly used 785 nm wavelength. The partial least squares (PLS) method was used to calibrate concentration models on calibration set samples containing 0.1% to 1.0% w/w ETV-H. Different preprocessing methods were used to eliminate the background interference and extract more spectral information. Calibration samples were used to choose the best performing model. Then, a model built from all the calibration samples with the best performing model's parameters successfully predicted the content of ETV-H in Baraclude tablets. Combining baseline processing and standard normal variate (SNV) with PLS, the model showed a good result with an R² of 0.973, RMSEC of 0.05%, and RMSEP of 0.03% in the spectral region of 1350-1700 cm⁻¹. The limit of detection of this model was 0.17%. These results showed that 1064 nm Raman spectroscopy could be an alternative analytical procedure to quantify low-content API in intact tablets.


Introduction
In 2005, Baraclude (entecavir) tablets were approved by the FDA for the treatment of chronic hepatitis B, which had infected more than 400 million people in the world [1]. In the FDA studies, people treated with Baraclude showed significant improvement in the liver inflammation and liver scarring caused by HBV. Entecavir (ETV) has therefore become the first choice for hepatitis B treatment in the market [2]. ETV is a carbocyclic 2′-deoxyguanosine analogue, which can be phosphorylated to the triphosphate form that inhibits HBV in active cells. The chemical name for ETV is 2-amino-1,9-dihydro-9-((1S,3R,4S)-4-hydroxy-3-(hydroxymethyl)-2-methylenecyclopentyl)-6H-purin-6-one. ETV monohydrate (ETV-H) is the API of commercial ETV pharmaceutical tablets.
In the pharmaceutical industry, quality control (QC) and quality assurance (QA) are necessary at all stages of product processing, from the raw material to the packaged product. In the production process, the final drug product might contain an insufficient or excessive content of API [2,3]. QC of the API content in pharmaceutical tablets is therefore necessary to ensure the safety and efficacy of the drug products [4][5][6]. With respect to low-content API formulations, the FDA describes them as formulations in which the API accounts for less than 1% [7]. A possible problem for low-content API formulations is low relative potency of the final product, because the API is vulnerable to loss and contamination in the production process [8]. QC is therefore even more important for low-content API formulations. Traditional QC methods such as high-performance liquid chromatography (HPLC), solid-state NMR spectroscopy, mass spectrometry (MS), and X-ray powder diffraction are time consuming, destructive, and expensive, and they require lengthy sample preparation [9][10][11][12][13]. Motivated by the process analytical technology (PAT) initiative, QC techniques are being developed to provide real-time process control and a better understanding of the chemical and physical processes during production.
Vibrational spectroscopic techniques such as near-infrared spectroscopy (NIRS) and Raman spectroscopy have emerged as valuable tools for pharmaceutical quality analysis because they are fast, need no sample preparation, and do not damage the sample [14][15][16][17]. Because of its higher resolution than NIR, Raman spectroscopy has become the more promising QC technique in the pharmaceutical industry [18][19][20]. Griffen et al. reported quantification of low-level polymorph content (0.62-1.32%) in tablets by transmission Raman spectroscopy [21]. Li et al. demonstrated low-content (<0.1%) quantification in powders using 785 nm Raman spectroscopy [22]. Assi et al. studied the application of a handheld Raman spectrometer for the quantification of ciprofloxacin in proprietary Ciproxin tablets and generic ciprofloxacin tablets [23]. Daniela et al. studied the potential of Raman spectroscopy as a PAT tool for in-line, real-time monitoring of the powder blending process and showed that Raman spectroscopy was effective in the determination of API in the tablets. They also found that Raman performed better than traditional HPLC analysis [24]. Liljana et al. compared three technologies (mid-infrared, near-infrared, and Raman spectroscopy) for the quantitative analysis of low-content API in blending powders and found that Raman presented the most favorable statistical indicators in this comparative study [25].
Since the Raman signal is proportional to the concentration of the analyte in the sample, univariate regression analysis (peak height or peak area) is theoretically feasible [26,27]. However, in the quantitative analysis of most tablets, useful single peaks are difficult to find because of interference from other components in the tablet. More complex multivariate methods such as partial least squares (PLS) regression and principal component analysis (PCA) are therefore required. Multivariate methods can handle thousands of variables and improve accuracy, since they consider not only the intensity or area of selected bands but also the variation over the entire spectral range. Therefore, Raman spectroscopy combined with multivariate algorithms provides a promising basis for the development of successful quantitative models, even for fairly complex mixtures or a low content of the tested component. Farias et al. applied Raman spectroscopy combined with PLS to determine and quantify crystalline forms of the API in final products as obtained in the production lines [28]. Maxwell et al. reported the development of chemometric models (PLS and PCR) for Raman spectroscopy to determine the polymorphic changes of theophylline in pharmaceutical products [29].
Before building quantitative models of APIs in tablets, it is necessary to pretreat the data to remove interfering information from the Raman spectra. Baseline correction is commonly used in Raman spectroscopy to eliminate the contributions of fluorescent contaminants and instrumental factors. Three other commonly used preprocessing methods are multiplicative scatter correction (MSC), standard normal variate (SNV), and Savitzky-Golay derivatives (SG derivatives). The purpose of MSC is to remove scattering artifacts, and that of SNV is to remove the scattering variations between measurements [30,31]. SG derivatives are used to remove noise and background variance [32].
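The two scatter corrections named above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' Matlab implementation; the function names `snv` and `msc` are chosen here for illustration, the use of the sample standard deviation in SNV is an assumption, and the mean spectrum is assumed as the MSC reference.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it to unit spread."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1), an assumed convention
    return [(v - m) / s for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    n_points = len(spectra[0])
    ref = [mean(s[j] for s in spectra) for j in range(n_points)]
    ref_mean = mean(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit of the spectrum to the reference: s ~ a + b * ref
        sxy = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s))
        b = sxy / sxx
        a = s_mean - b * ref_mean
        # Remove the fitted offset (a) and the multiplicative factor (b)
        corrected.append([(v - a) / b for v in s])
    return corrected
```

Both functions map a spectrum and an offset, rescaled copy of it to the same output, which is the scatter-removal behaviour described in the text.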
In the present study, Raman spectroscopy with an excitation wavelength of 1064 nm was used to quantify low-content ETV-H in tablets for the first time. Calibration samples with different concentrations of ETV-H in the dosage form were prepared and measured by Raman spectroscopy. Pretreatment methods (baseline correction, MSC, SNV, and Savitzky-Golay first and second derivatives) were applied to the data. Then, the data were used to build a quantitative model to detect the low-content ETV-H in commercially available Baraclude tablets.

Preparation of Calibration and Test Set Samples.
As the commercially available Baraclude tablets had an API content of only 0.25% w/w, the calibration set was prepared with ETV-H mass concentrations of 0.1%-1.0% w/w. The content of each component in the calibration set samples is shown in Table 1. All the components were sieved through a 400-mesh sieve (with an aperture of 38 μm) for homogeneity and weighed according to the quantities listed in the table. To ensure thorough mixing, the total weight of each concentration sample was increased to 2000 mg. The API and all excipients were mixed using an MX-S vortex oscillator. Each concentration sample was divided into ten tablets, and each tablet (200 mg) was pressed using a YP-15 manual powder compactor (Josvok Technology Co., Ltd., Tianjin, China) with a 10 mm die set. The compression pressure was 25 MPa with a dwell time of 1 min. Each concentration was measured by Raman spectroscopy using three tablets. The test set contained two different strengths of Baraclude tablets (0.5 mg and 1.0 mg ETV-H per tablet, both with an ETV-H concentration of 0.25% w/w) and homemade tablets with an ETV-H mass concentration of 0.5% w/w. Each test sample was measured by Raman spectroscopy using two tablets.
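The weighing arithmetic implied by this paragraph can be checked with a short sketch. The concentration levels are the seven reported later for the calibration set; the function names are hypothetical, and the "implied core mass" is a quantity derived here from the stated dose and concentration, not a value reported in the paper.

```python
# Calibration levels (% w/w) and blend mass (mg) as stated in the text
LEVELS_PCT = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 1.0]
BATCH_MG = 2000.0


def api_mass_mg(conc_pct, batch_mg=BATCH_MG):
    """Mass of ETV-H (mg) needed for a blend of batch_mg at conc_pct % w/w."""
    return batch_mg * conc_pct / 100.0


def implied_core_mass_mg(dose_mg, conc_pct):
    """Tablet core mass (mg) implied by a dose and a % w/w concentration."""
    return dose_mg / (conc_pct / 100.0)


for level in LEVELS_PCT:
    print(f"{level:.1f}% w/w -> {api_mass_mg(level):.1f} mg ETV-H per 2000 mg blend")
```

At the lowest level the blend contains only about 2 mg of ETV-H in 2000 mg, which illustrates why thorough mixing matters for these samples.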

Instrumentation and Software.
The Raman spectra were collected using a Rigaku Progeny handheld Raman spectrometer (Rigaku Co., Tokyo, Japan) with a 1064 nm high-power excitation laser. The instrument could give a maximum laser power of 490 mW at the source. The actual laser power reaching the sample was 142 mW. The focused spot diameter was 25 μm. Raman spectra were recorded in the wave number range 200-2500 cm⁻¹ at a resolution of 8-11 cm⁻¹ with transmission volume phase gratings.
A LabRAM HR Evolution Raman spectrometer (Horiba Jobin Yvon Inc.) with 532 nm, 633 nm, and 785 nm excitation wavelengths was also used to obtain the Raman spectra of ETV-H. The laser output power was set to 10 mW (maximum output power), and the integration time was set to 10 s over the Raman shift range of 200 cm⁻¹ to 4000 cm⁻¹.
Multivariate data analysis was carried out using Matlab R2017b (Mathworks Inc., MA, USA).

Spectral Pretreatments and Chemometrics.
To establish better analysis and prediction methods, it was necessary before quantitative analysis to select a wave number range that was favorable for predicting the analyte and that excluded noise. This study used the correlation coefficient method to select the wave number range. The method calculates the correlation between the ETV-H concentration in the tablets and the Raman intensity at each wave number. The correlation coefficient is calculated by

$$R_j = \frac{\sum_{i=1}^{n} (x_{i,j} - \bar{x}_j)(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_{i,j} - \bar{x}_j)^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}, \tag{1}$$

where j represents the jth wave number, i represents the ith sample, n is the number of samples, $x_{i,j}$ is the Raman intensity at the jth wave number of the ith sample, $\bar{x}_j$ is the mean intensity at the jth wave number, $\bar{y}$ is the mean reference value for all samples, and $y_i$ is the reference value of the ith sample.
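The correlation coefficient method described here can be sketched as follows, assuming the standard Pearson form implied by the variable definitions in the text. The function names and the `threshold` parameter are hypothetical; the paper does not state a cut-off value.

```python
from math import sqrt


def correlation_per_wavenumber(spectra, concentrations):
    """Pearson correlation between intensity and concentration at each wave number.

    spectra: list of spectra (one list of intensities per sample)
    concentrations: reference values, one per sample
    """
    n = len(spectra)
    y_mean = sum(concentrations) / n
    syy = sum((y - y_mean) ** 2 for y in concentrations)
    result = []
    for j in range(len(spectra[0])):
        column = [s[j] for s in spectra]
        x_mean = sum(column) / n
        sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(column, concentrations))
        sxx = sum((x - x_mean) ** 2 for x in column)
        # A constant column carries no information; report zero correlation
        result.append(sxy / sqrt(sxx * syy) if sxx > 0 else 0.0)
    return result


def select_indices(correlations, threshold):
    """Indices of wave numbers whose |R| meets a hypothetical threshold."""
    return [j for j, r in enumerate(correlations) if abs(r) >= threshold]
```

Wave numbers with a correlation close to +1 or -1 track the ETV-H concentration, so a contiguous run of such indices suggests a candidate feature range.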
After the wave number range was selected, the Raman data were pretreated. Baseline correction was performed by polynomial fitting. Savitzky-Golay first and second derivatives, standard normal variate (SNV), and multiplicative scatter correction (MSC) were also applied before quantitative analysis. The first and second Savitzky-Golay derivatives were calculated with a window size of 10 points and a second-order polynomial. The calibration set was validated by dividing the samples randomly into two sets (a calibration set and a validation set).
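A polynomial baseline correction of the kind mentioned above might look like the sketch below. The polynomial order is an assumption (the text does not state it), the function names are hypothetical, and this simple version fits the polynomial to every point of the spectrum; practical routines usually iterate so that Raman peaks do not pull the baseline upward.

```python
def fit_polynomial(x, y, order):
    """Least-squares coefficients c[0] + c[1]*x + ... from the normal equations."""
    m = order + 1
    a = [[sum(xi ** (r + c) for xi in x) for c in range(m)] for r in range(m)]
    b = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(m)]
    # Gaussian elimination with partial pivoting
    for k in range(m):
        pivot = max(range(k, m), key=lambda r: abs(a[r][k]))
        a[k], a[pivot] = a[pivot], a[k]
        b[k], b[pivot] = b[pivot], b[k]
        for r in range(k + 1, m):
            factor = a[r][k] / a[k][k]
            for c in range(k, m):
                a[r][c] -= factor * a[k][c]
            b[r] -= factor * b[k]
    # Back substitution
    coef = [0.0] * m
    for k in range(m - 1, -1, -1):
        tail = sum(a[k][c] * coef[c] for c in range(k + 1, m))
        coef[k] = (b[k] - tail) / a[k][k]
    return coef


def baseline_correct(spectrum, order=2):
    """Subtract a fitted polynomial baseline (order is an assumed parameter)."""
    n = len(spectrum)
    # Scale the point index to [-1, 1] to keep the normal equations well conditioned
    x = [2.0 * i / (n - 1) - 1.0 for i in range(n)]
    coef = fit_polynomial(x, spectrum, order)
    baseline = [sum(c * xi ** p for p, c in enumerate(coef)) for xi in x]
    return [s - bl for s, bl in zip(spectrum, baseline)]
```

A spectrum that is purely a smooth second-order background is reduced to zero by this function, while a narrow peak on top of such a background is largely retained.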
Partial least squares (PLS) regression was used for quantitative modeling. To avoid overfitting, the most suitable PLS model should have a low number of PLS latent variables (factors) and low values of three parameters: the root-mean-square error of cross validation (RMSECV), the root-mean-square error of calibration (RMSEC), and the root-mean-square error of prediction (RMSEP). RMSE is defined by the following equation:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2}, \tag{2}$$

where $x_i$ is the measurement value, $y_i$ is the prediction value, and n is the number of samples [33]. The limit of detection (LOD) of the calibration model is defined as follows [21]:

$$\mathrm{LOD} = \frac{3.3\,\sigma}{S}, \tag{3}$$

where σ is the standard deviation of the regression fit and S is the slope of the calibration curve.
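The two figures of merit defined in this paragraph are simple to compute. The sketch below assumes the usual root-mean-square form for RMSE and the common ICH-style factor of 3.3 for the LOD; the factor is exposed as the parameter `k` because it is an assumption here, and the function names are illustrative.

```python
from math import sqrt


def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted values."""
    n = len(measured)
    return sqrt(sum((x - y) ** 2 for x, y in zip(measured, predicted)) / n)


def limit_of_detection(sigma, slope, k=3.3):
    """LOD = k * sigma / slope; k = 3.3 is the assumed conventional factor."""
    return k * sigma / slope
```

The same `rmse` function yields RMSEC, RMSECV, or RMSEP depending on whether the predictions come from the calibration samples, the left-out samples in cross validation, or the validation set.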

The Determination and Selection of Different Raman Wavelengths.
The ETV-H solid powder was measured with different Raman excitation wavelengths. As shown in Figure 1, the Raman spectrum obtained with 1064 nm excitation had a higher signal-to-background ratio (S/B) and lower background fluorescence interference than the 785 nm, 633 nm, and 532 nm Raman spectra. At 1487 cm⁻¹ (the highest peak in the spectrum), the S/B was 0.1 for 532 nm excitation, 1.0 for 633 nm, 1.9 for 785 nm, and 27.3 for 1064 nm. The S/B of the spectrum with 1064 nm excitation was thus markedly higher than that of the other wavelengths and was 14 times that of the commonly used 785 nm wavelength.
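The "14 times" figure follows directly from the reported S/B values, as this short check shows. The dictionary and function names are illustrative only; the values are those quoted in the text for the 1487 cm⁻¹ peak.

```python
# Reported signal-to-background ratios at 1487 cm^-1, keyed by excitation wavelength (nm)
SB_RATIO = {532: 0.1, 633: 1.0, 785: 1.9, 1064: 27.3}


def gain_over(reference_nm, target_nm=1064):
    """How many times larger the S/B at target_nm is than at reference_nm."""
    return SB_RATIO[target_nm] / SB_RATIO[reference_nm]


for wavelength in (532, 633, 785):
    print(f"1064 nm vs {wavelength} nm: {gain_over(wavelength):.1f}x")
```

The ratio against 785 nm is 27.3 / 1.9, or about 14.4, which the text rounds to 14.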

Chemometric Models for Quantitation of Baraclude Tablets.
Calibration set samples (21 samples) with different API mass concentrations (% w/w) were measured by Raman spectroscopy at 1064 nm excitation wavelength. The coating layer of the Baraclude tablets (the ingredient was Opadry®) was carefully scraped off with a special blade. After the coating layer was removed, the Baraclude tablets (0.5 mg and 1.0 mg ETV-H per tablet, both with an ETV-H concentration of 0.25% w/w) were measured to obtain the Raman spectra of the test set. All Raman spectra of the samples are shown in Figure 2. As the concentration of ETV-H in the tablets increased, the Raman intensity changed only slightly at the 1580 cm⁻¹ peak (a characteristic Raman peak of ETV-H). It was therefore necessary to select a wave number range instead of a single peak.
Raman spectra of ETV-H and all excipients are presented in Figure 3. The assignments of all characteristic Raman peaks of ETV-H and all excipients are listed in Table 2. It could be concluded that ETV-H differed from all the excipients in the spectral region of 1350-1700 cm⁻¹. This meant that the API could be distinguished in this region. The correlation coefficient method was also used to select the characteristic range, and the results showed that the wave number range of 1350-1700 cm⁻¹ could be used as the feature range. After the characteristic range was selected, the number of wave numbers used for the quantitative model was reduced from 512 to 72. In the figure, the blue dotted frame represents the selected spectral range of 1350-1700 cm⁻¹.
The partial least squares (PLS) method was used to calibrate concentration models on the 21 calibration set samples containing 0.1%, 0.2%, 0.3%, 0.4%, 0.6%, 0.8%, and 1.0% w/w ETV-H. The optimum number of PLS latent variables (LVs) was chosen by comparing the root-mean-square error of cross validation (RMSECV, leave-one-out validation) to ensure that most of the variation was included [34]. As shown in Figure 4(a), three PLS components were selected based on the minimal RMSECV value obtained by combining baseline correction and SNV for the pretreatment of the samples within the wave number range of 1350-1700 cm⁻¹. This approach avoided the overfitting that occurs when too many latent variables are selected.
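Choosing the number of latent variables by leave-one-out RMSECV can be sketched with a small PLS1 implementation. This is an illustrative NIPALS-style version in plain Python, not the Matlab routine used in the study; all function names are hypothetical and the 1e-12 stopping tolerance is an arbitrary assumption.

```python
from math import sqrt


def pls1_fit(X, y, n_lv):
    """Fit a single-response PLS model; returns (x_mean, y_mean, components)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                # y residuals
    comps = []
    for _ in range(n_lv):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate both blocks before extracting the next component
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x):
    """Predict the response for one spectrum by sequential deflation."""
    x_mean, y_mean, comps = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat


def rmsecv_loo(X, y, n_lv):
    """Leave-one-out RMSECV for a given number of latent variables."""
    total = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_lv)
        total += (y[i] - pls1_predict(model, X[i])) ** 2
    return sqrt(total / len(X))


def best_n_lv(X, y, max_lv):
    """Number of latent variables (1..max_lv) with the smallest RMSECV."""
    return min(range(1, max_lv + 1), key=lambda a: rmsecv_loo(X, y, a))
```

Taking the global minimum of RMSECV is the simplest rule; in practice a smaller number of latent variables is often preferred when the RMSECV curve flattens, which matches the stated aim of keeping the model from overfitting.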
A latent variable loadings plot of the model (three PLS components, baseline correction and SNV as pretreatment, wave number range of 1350-1700 cm⁻¹) is shown in Figure 4. The total scores of the three latent variables reached 99%. The number of PLS components for all other models was chosen by the same method. Then, the 21 calibration set samples were divided into a calibration set (14 samples) and a validation set (7 samples). Combining different preprocessing methods and spectral ranges, we obtained 16 models (such as the model with baseline correction as the pretreatment method in the full spectral range). As shown in Table 3, the PLS LVs of all the models were determined by the method mentioned above. The accuracy of each PLS calibration model was evaluated by the squared correlation coefficient (R²) and RMSEP, so the RMSECV, R², RMSEC, and RMSEP values of all the models were calculated using the corresponding number of LVs. The best performing model, marked in italics in Table 3, combined baseline correction with the SNV method for data pretreatment in the wave number range of 1350-1700 cm⁻¹. As shown in Figure 5, the squared correlation coefficient of the best model was 0.970, RMSEC was 0.05%, and RMSEP was 0.05%. These results indicated the reliability and accuracy of the model. Then, all the calibration samples (21 samples) were used to build a PLS model with the best performing model parameters, and the ETV-H content in Baraclude tablets and homemade tablets was successfully predicted. This final model, built on the spectral region 1350-1700 cm⁻¹ with baseline correction and SNV as the preprocessing methods, gave an R² of 0.973, RMSEC of 0.05%, and RMSEP of 0.03% (Figure 6).
As shown in Figure 6, because the API content fluctuated within the content uniformity margin of the tablets, the surface of the prepared tablets was not uniform despite the long mixing process. Raman spectroscopy could not overcome the problem of uneven surface content, so the exact ETV-H content in each tablet could not be determined by Raman spectroscopy. Therefore, for all samples, only approximate values of the real ETV-H concentration were used for the quantitation model. The error of the approximate value and the error of the measured value propagate together and affect the statistical indicators of the models (R², RMSEC, and RMSEP).
It could be seen that the original data could be used to quantify ETV-H in Baraclude tablets after baseline processing and SNV pretreatment. In addition, the prediction results of the models were acceptable when using SNV pretreatment only. From this point of view, SNV performed better than the other pretreatment methods for the Raman spectral data. According to equation (3), the σ value of the LOD equation referred to the RMSEC of the model, and S referred to the slope of the calibration curve. The LOD of the best performing model was therefore 0.17%. According to US Pharmacopeia guidelines, the acceptable percentage of the labeled amount of ETV-H in the drug content ranged from 90.0% to 105.0%. As shown in Table 4, the predicted ETV-H contents in commercial tablets were all within this range.
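The acceptance check described here amounts to expressing each predicted content as a percentage of the labeled content and comparing it with the 90.0%-105.0% window quoted in the text. The sketch below uses hypothetical predicted values purely for illustration; they are not the Table 4 results, and the function names are assumptions.

```python
def percent_of_label(predicted_pct_ww, labeled_pct_ww):
    """Predicted content expressed as a percentage of the labeled content."""
    return 100.0 * predicted_pct_ww / labeled_pct_ww


def within_limits(percent, low=90.0, high=105.0):
    """True if the value falls inside the acceptance window quoted in the text."""
    return low <= percent <= high


# Hypothetical predictions (% w/w) for a tablet labeled at 0.25% w/w
for predicted in (0.24, 0.25, 0.27):
    pct = percent_of_label(predicted, 0.25)
    print(f"{predicted:.2f}% w/w -> {pct:.1f}% of label, pass = {within_limits(pct)}")
```

With a labeled content of 0.25% w/w, the window corresponds to predicted contents between 0.225% and 0.2625% w/w, which is narrow compared with the reported LOD of 0.17%.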

Conclusions
Raman spectroscopy with 1064 nm excitation wavelength was successfully employed as an analytical tool for the nondestructive determination, without sample preparation, of low-content ETV-H in Baraclude tablets. By applying different preprocessing methods (baseline correction, SNV, MSC, and Savitzky-Golay first and second derivatives), PLS quantitative models were built to predict the concentration of ETV-H in Baraclude tablets. The calibration samples were divided into two sets, and the best performing model was chosen. Then, the best chemometric model was rebuilt using all the calibration samples and used to predict the ETV-H content in the test set samples. It showed a good result with an R² of 0.973, RMSEC of 0.05%, and RMSEP of 0.03% in the spectral region 1350-1700 cm⁻¹ with baseline processing and SNV as preprocessing methods for the raw data. The LOD of the best performing model was 0.17% w/w. The predicted ETV-H contents in Baraclude tablets were all in the range defined in the US Pharmacopeia.
In addition to the backscattering mode, Raman technology has two other modes: transmission Raman spectroscopy and spatially offset Raman spectroscopy. Transmission Raman spectroscopy can penetrate the entire tablet to obtain information on the whole sample, including the API concentration and the spectra of all ingredients. Spatially offset Raman spectroscopy can obtain deep feature information from inside the sample through nontransparent packaging or surfaces. In comparison, backscattering Raman can only obtain information from the sample surface, which may be insufficient when analyzing low-content API tablets. However, compared with backscattering Raman, spatially offset Raman and transmission Raman spectroscopy both require higher laser power, which might affect the measured sample. This study therefore chose backscattering Raman spectroscopy for the quantification of low-content API tablets. The results showed that 1064 nm Raman spectroscopy can predict the low-content API in commercially available pharmaceutical tablets.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare no conflicts of interest.

Authors' Contributions
Yanlei Kang and Yushan Zhou contributed equally to this work.

Table 4: The actual and predicted API content and the percentage of predicted API in the drug content calculated by the best performing model.