Development of Quantitative Structure-Property Relationship Models for Self-Emulsifying Drug Delivery System of 2-Aryl Propionic Acid NSAIDs

We developed quantitative structure-property relationship (QSPR) models to correlate the molecular structures of surfactant, cosurfactant, oil, and drug with the solubility of poorly water-soluble 2-aryl propionic acid nonsteroidal anti-inflammatory drugs (2-APA-NSAIDs) in self-emulsifying drug delivery systems (SEDDSs). The compositions were encoded with electronic, geometrical, topological, and quantum chemical descriptors. To obtain reliable predictions, we used multiple linear regression (MLR) and artificial neural network (ANN) methods for model development. The resulting equations were validated using a test set of 42 formulations and showed good predictive power; the linear models performed better than the nonlinear ones. These QSPR models could facilitate fast screening for optimal SEDDS formulations at an early stage of drug development and reduce experimental effort.


Introduction
The low water solubility of many drug candidates has been a major challenge for the pharmaceutical industry, since oral delivery of these drugs may lead to low bioavailability and high intra- and intersubject variability [1]. Several formulation approaches to improve the solubility of these drugs have been investigated, including cyclodextrins [2], micelles [3], nanoparticles [4], solid dispersions [5], and self-emulsifying drug delivery systems (SEDDSs). SEDDSs are isotropic mixtures of an oil, surfactant, cosurfactant, and drug that form an O/W emulsion or microemulsion when introduced into aqueous phases under gentle agitation [6][7][8]. They can enhance the oral bioavailability of hydrophobic drugs, which makes them attractive carriers for poorly water-soluble drugs [8][9][10][11]. Dissolution in the SEDDS and the absence of precipitation in the gastrointestinal tract are among the prerequisites for efficient intestinal absorption of drugs [12]. Drug solubility in SEDDS is therefore a key parameter for selecting optimal formulations [13].
Pharmaceutical preparation is a complicated procedure that includes preformulation studies, formulation screening, technology optimization, and stability studies. Among these, screening for the optimum formulation is a crucial step. Usually, the first stage is to select suitable excipients and preparation technology through preliminary experiments, and then to screen for the optimized formulation using single-factor design, orthogonal design, or uniform design. These experimental processes are expensive and time consuming. Estimating properties by theoretical modeling is therefore an efficient route to formulation screening. Quantitative structure-property relationship (QSPR) modeling is the process by which chemical structure is quantitatively correlated with a physical, chemical, or biological property. It has been widely used in pharmaceutical research [14][15][16], including the prediction of the biological activity [17], absorption [18,19], distribution [20,21], metabolism, excretion [22], and chemical reactivity-related toxicity [23] (ADMET) properties of drugs. However, QSPR is rarely applied in pharmaceutics [24][25][26], since numerous factors may affect the preparation process. It is therefore worthwhile to introduce QSPR into pharmaceutics by establishing, through mathematical methods, the relationship between the properties of a formulation and the chemical structures of its components, which should reduce experimental time.
The aim of this study was to develop usable QSPR models for predicting drug solubility in SEDDS. We investigated a set of poorly water-soluble 2-aryl propionic acid nonsteroidal anti-inflammatory drugs (2-APA-NSAIDs). We then applied the resulting models to understand the mechanism of drug solubilization in SEDDS and to rapidly screen for optimized formulations.

Solubility Studies.
In the study, 0.1 g of self-emulsifying mixture was diluted with distilled water to 5 mL in a sealed tube and gently mixed with a vortex mixer (Ika, Germany). An excess amount of drug was added to the resulting microemulsions or emulsions. The blend was mixed, left to equilibrate at 37 °C for 48 h in a water bath, and then centrifuged at 6,000 rpm for 10 min. The supernatant was filtered through a filter membrane (0.22 μm), diluted with methanol to a suitable concentration range, and quantified by HPLC (see Section 2.2.3).

HPLC Analysis of the Model Drugs.
The HPLC analysis was performed with a Waters 515 pump and a 2487 UV-VIS detector. The column was a Diamosil C18 column, 100 mm × 4.6 mm (Dikama, China). The mobile phase consisted of a mixture of methanol, water, and phosphoric acid (20 : 80 : 0.1, v/v/v). The UV detector wavelengths were set at 254 nm (ketoprofen), 222 nm (ibuprofen), 247 nm (flurbiprofen), 273 nm (naproxen), 222 nm (loxoprofen), and 300 nm (carprofen). Elution was carried out at a flow rate of 1.0 mL/min, and the column oven (PH-730A, Phenomen, China) was set to 30 °C. Each measurement was repeated three times.

Descriptor Generation and Variable Selection.
Molecular descriptors are commonly used to represent the structural and physicochemical features of the formulation components so that they can be used in a QSPR model. To establish a QSPR model, ab initio quantum mechanical calculations were therefore first performed to obtain the relevant molecular descriptors using the Gaussian 03 software package (Gaussian, Inc., Pittsburgh, 2003). Geometry optimization was carried out, and quantum chemical and electrostatic parameters were calculated, at the RHF/6-31G* level. Quantum chemical parameters including the dipole moment (Dipole) and the energies of the highest occupied molecular orbital (E_HOMO) and the lowest unoccupied molecular orbital (E_LUMO), as well as electrostatic parameters including MaxQ−, MaxQ+, ABSQ, and ABSQon, were obtained.
In addition, the Discovery Studio 1.7 package (Accelrys Inc., USA) was used to calculate parameters such as molecular volume, polar surface area, Wiener index, logD, and logP. Constitutional parameters including surfactant ratio (SR), cosurfactant ratio (CoSR), and oil ratio (OR) were also calculated. Table 2 shows the values of important descriptors. The nonionic surfactants Tween20, Tween40, and Tween80 belong to the polyoxyethylene sorbitan family. They have similar head structures, and the differences observed in their behavior are mainly due to their different hydrophobic portions [27]. Each surfactant structure was therefore cleaved into two parts, the common hydrophilic segment (HS) and a surfactant-specific lipophilic segment (LS), and the descriptors of the two parts were calculated separately. The cleavage was performed as in Taha et al. [26].
The role of the cosurfactant in the formation of SEDDS is to increase interfacial flexibility by extending into the surfactant interfacial monolayer and consequently creating void space among the surfactant molecules [13]. Both surfactant and cosurfactant in SEDDS are used to reduce the interfacial tension. For simplicity, we therefore combined the descriptors of surfactant and cosurfactant. The overall descriptor was calculated as follows:

$$\text{Descriptor of } S_{mix} = R_s \times D_s + R_{cos} \times D_{cos}, \tag{1}$$

where $R_s$ is the ratio (w/w) of surfactant, $D_s$ is the molecular descriptor of the lipophilic segment of the surfactant, $R_{cos}$ is the ratio (w/w) of cosurfactant, and $D_{cos}$ is the molecular descriptor of the cosurfactant.

The descriptors were selected to give a stable and interpretable model. A three-stage manual descriptor selection process was performed: (1) descriptors with too many zero values or with identical values (the descriptors of the Tween HS) were eliminated; (2) descriptors with very small standard deviation values (<0.5%) were removed; (3) one descriptor was chosen to represent each group of highly correlated variables (correlation coefficients >0.80), thereby minimizing redundancy and overlap among the descriptors. Since the ranges of descriptor values influence the quality of the models generated, we normalized the remaining descriptor values to a range of 0 to 1 [28].
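The descriptor handling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the numeric values are invented rather than taken from Table 2, and the rule of keeping the first descriptor of each correlated group is an assumption, since the text does not say how the representative descriptor was picked.

```python
def smix_descriptor(r_s, d_s, r_cos, d_cos):
    """Equation (1): ratio-weighted sum of surfactant (LS) and cosurfactant descriptors."""
    return r_s * d_s + r_cos * d_cos


def min_max_normalize(values):
    """Rescale a list of descriptor values to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant descriptor carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient (inputs must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def drop_correlated(columns, threshold=0.80):
    """Stage 3: keep one descriptor per group with |r| > threshold.

    Assumption: the first descriptor encountered represents its group.
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical formulations (values are illustrative, not from Table 2).
formulations = [
    {"r_s": 0.4, "d_s": 300.0, "r_cos": 0.2, "d_cos": 50.0},
    {"r_s": 0.5, "d_s": 300.0, "r_cos": 0.1, "d_cos": 50.0},
    {"r_s": 0.3, "d_s": 350.0, "r_cos": 0.3, "d_cos": 50.0},
]
smix_volume = [smix_descriptor(**f) for f in formulations]
normalized = min_max_normalize(smix_volume)

# Hypothetical descriptor columns: "b" is almost a multiple of "a".
columns = {"a": [1.0, 2.0, 3.0, 4.0], "b": [2.0, 4.0, 6.0, 8.1], "c": [4.0, 1.0, 3.0, 2.0]}
selected = drop_correlated(columns)
```

In this toy data, descriptor "b" is dropped because it is nearly collinear with "a", while "c" is retained.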

QSPR Modeling.
To begin the model development process, the solubility data of drugs in formulas 1-6 were randomly split into a training set (80% of the total number of formulations) and an internal validation set (20% of the total number of formulations). The solubility data of drugs in formulas 7-8 were used as a predicting set. The descriptors selected in Section 2.3 were regressed against the solubility of the training set by means of multiple linear regression (MLR). The best equations were determined based on the highest squared multiple correlation coefficient (R²), the highest Fisher ratio (F), and the lowest standard error (s).
Artificial neural networks (ANNs) are well suited to modeling nonlinear relationships [29] and were also tried in an attempt to obtain better predictive models. All networks used in this study were of the three-layered back-propagation (BP) type. The input data were the descriptors selected in the linear models, and the output neuron corresponded to the solubility values of drugs in SEDDS. Sigmoid transfer functions were used in all layers. The number of neurons in the hidden layer was adjusted to optimize the network, and the best model was the one with the highest correlation coefficient (r) and the lowest MSE. The internal validation set (18 formulations) was used to prevent overfitting.
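As a rough illustration of the random 80/20 split and the MLR fit, the following sketch uses only the Python standard library and solves the normal equations directly. The data are synthetic, the function names and the random seed are hypothetical, and the study's actual software and settings are not stated here, so none of this should be read as the authors' implementation.

```python
import random


def split_train_validation(n_samples, train_fraction=0.8, seed=0):
    """Randomly split sample indices into training and internal validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Ordinary least squares via the normal equations; returns [intercept, b1, ...]."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic, noise-free data: y = 1 + 2*x1 - 3*x2 (not experimental values).
rng = random.Random(1)
X = [[rng.random(), rng.random()] for _ in range(10)]
y = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in X]

train_idx, val_idx = split_train_validation(len(X))
coefs = fit_mlr([X[i] for i in train_idx], [y[i] for i in train_idx])
train_r2 = r_squared([y[i] for i in train_idx],
                     [predict(coefs, X[i]) for i in train_idx])
val_pred = [predict(coefs, X[i]) for i in val_idx]
```

With real, noisy data the coefficients would of course not be recovered exactly, and R², F, and s would be compared across candidate descriptor subsets as described above.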
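A minimal sketch of a three-layer back-propagation network with sigmoid transfer functions is given below. Several points are assumptions made for illustration: plain stochastic gradient descent is used for the weight updates, the learning rate, epoch count, patience, and hidden-layer size are arbitrary, the class and function names are hypothetical, and the data are synthetic. The validation set is used here only to stop training once its error no longer improves.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ThreeLayerNet:
    """Input layer -> one sigmoid hidden layer -> one sigmoid output neuron."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last entry of every weight list is the bias term.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(ws, x)) + ws[-1])
                  for ws in self.w_hidden]
        out = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden)) + self.w_out[-1])
        return hidden, out

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr):
        """One back-propagation update for a single sample (squared-error loss)."""
        hidden, out = self.forward(x)
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden-layer deltas use the output weights before they are updated.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            self.w_out[j] -= lr * delta_out * h
        self.w_out[-1] -= lr * delta_out
        for j, ws in enumerate(self.w_hidden):
            for i, v in enumerate(x):
                ws[i] -= lr * delta_hidden[j] * v
            ws[-1] -= lr * delta_hidden[j]


def mse(net, xs, ys):
    return sum((net.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(net, train_x, train_y, val_x, val_y, epochs=500, lr=0.5, patience=50):
    """Train until the validation MSE has not improved for `patience` epochs.

    Returns the lowest validation MSE seen (including the untrained network).
    """
    best_val, stale = mse(net, val_x, val_y), 0
    for _ in range(epochs):
        for x, y in zip(train_x, train_y):
            net.train_step(x, y, lr)
        val = mse(net, val_x, val_y)
        if val < best_val:
            best_val, stale = val, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_val


# Synthetic data with inputs and targets already inside the 0 to 1 range.
rng = random.Random(42)
xs = [[rng.random(), rng.random()] for _ in range(25)]
ys = [0.2 + 0.6 * a * b for a, b in xs]
train_x, train_y, val_x, val_y = xs[:20], ys[:20], xs[20:], ys[20:]

net = ThreeLayerNet(n_inputs=2, n_hidden=3)
initial_val = mse(net, val_x, val_y)
best_val = train(net, train_x, train_y, val_x, val_y)
```

Because the output neuron is a sigmoid, the targets must lie between 0 and 1, which is one practical reason to normalize the data before training. Repeating the run with different values of `n_hidden` and comparing r and MSE corresponds to the hidden-layer adjustment described above.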

Statistical Analysis.
To evaluate the predictive ability of the QSPR models, the following statistical parameters were used: mean square error (MSE), root mean square error of prediction (RMSEP), root mean square error (RMSE), relative standard error of prediction (RSEP), and mean absolute error (MAE) [30]. Table 3 gives the corresponding equations.
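The equations of Table 3 are not reproduced in this text, so the sketch below falls back on commonly used definitions of these parameters. This is an assumption: in particular, RSEP is taken as 100 × sqrt(Σ(y_pred − y_exp)² / Σ y_exp²), and RMSEP is taken to be the RMSE evaluated on the predicting set. The example values are invented.

```python
import math


def mse(y_exp, y_pred):
    """Mean square error."""
    return sum((p - e) ** 2 for e, p in zip(y_exp, y_pred)) / len(y_exp)


def rmse(y_exp, y_pred):
    """Root mean square error (RMSEP when applied to the predicting set)."""
    return math.sqrt(mse(y_exp, y_pred))


def mae(y_exp, y_pred):
    """Mean absolute error."""
    return sum(abs(p - e) for e, p in zip(y_exp, y_pred)) / len(y_exp)


def rsep(y_exp, y_pred):
    """Relative standard error of prediction, in percent (assumed definition)."""
    num = sum((p - e) ** 2 for e, p in zip(y_exp, y_pred))
    den = sum(e ** 2 for e in y_exp)
    return 100.0 * math.sqrt(num / den)


# Invented solubility values, for illustration only.
y_exp = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

results = {
    "MSE": mse(y_exp, y_pred),
    "RMSE": rmse(y_exp, y_pred),
    "MAE": mae(y_exp, y_pred),
    "RSEP": rsep(y_exp, y_pred),
}
```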

Results and Discussion
3.1. QSPR Models. Table 4 shows the solubility of 2-APA-NSAIDs in various formulations. In all the equations, the variance inflation factor (VIF) was less than 10, suggesting the absence of multicollinearity. As shown in Table 5, the correlation matrix shows no high correlation between the selected descriptors, so they could be used to develop QSPR models. The statistical results indicate that these equations are good models for calculating the solubility (Table 6). The models in (2) show the significance of the combination of SR, OR, O-MaxQ−, O-ABSQ, O-E_LUMO, S-Volume, and S-Dipole for the solubility of drugs in SEDDS. According to the t-test criterion, the most important descriptor is SR. Its positive coefficient suggests that a higher surfactant concentration increases the solubility. Surfactant plays an important role in O/W microemulsion/emulsion formation: it forms a layer around the emulsion droplets, which reduces the interfacial energy and provides a mechanical barrier to coalescence [31]. The result also suggests that the drugs are mainly dissolved in the surfactant phase.
The specific effect of O-MaxQ−, O-ABSQ, and S-Dipole on the solubility depends on the drug type.
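The multicollinearity check mentioned above (VIF below 10) can be illustrated with the following sketch. It relies on the standard identity that the VIFs of a set of predictors are the diagonal elements of the inverse of their correlation matrix. The descriptor values and function names are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation coefficient (inputs must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_matrix(columns):
    return [[pearson(a, b) for b in columns] for a in columns]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF_j is the j-th diagonal element of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Two hypothetical descriptor columns with a correlation of -0.4.
descriptors = [[1.0, 2.0, 3.0, 4.0], [4.0, 1.0, 3.0, 2.0]]
vifs = variance_inflation_factors(descriptors)
no_multicollinearity = all(v < 10.0 for v in vifs)
```

For two predictors the VIF reduces to 1 / (1 − r²), so a correlation of −0.4 gives a VIF of about 1.19, well below the threshold of 10.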

Figure 1: Molecular structures of model drugs.

2.4. QSPR Modeling. ANN models were constructed with the same descriptors as in the MLR models, using the Levenberg-Marquardt (LM) training algorithm. The proper

Table 2: Values of important descriptors.
filtered through a filter membrane (0.22 μm), diluted with methanol to a suitable concentration range, and quantified by HPLC (see Section 2.2.3).

Table 3: Equations of statistical parameters. y_pred and y_exp are predicted and experimental solubility values, respectively; n is the number of samples in the data set.

Table 5: Correlation matrix for selected descriptors.

Table 7: Experimental and predicted values of predicting set.