Numerous regression approaches to isotherm parameter estimation appear in the literature. Real insight into the proper modeling pattern can be achieved only by testing the methods on a very large number of cases. Experimentally, this cannot be done in a reasonable time, so the Monte Carlo simulation method was applied. The objective of this paper is to introduce and compare numerical approaches that involve different levels of knowledge about the noise structure of the analytical method used for initial and equilibrium concentration determination. Six levels of homoscedastic noise and five types of heteroscedastic noise precision models were considered. Performance of the methods was statistically evaluated based on median percentage error and mean absolute relative error in parameter estimates. The present study showed a clear distinction between two cases. When equilibrium experiments are performed only once, the winning error function for the homoscedastic case is ordinary least squares, while for heteroscedastic noise the use of orthogonal distance regression or Marquardt’s percent standard deviation is suggested. When experiments are repeated three times, the simple method of weighted least squares performed as well as the more complicated orthogonal distance regression method.
Adsorption is a mass transfer process that plays a central role in potable water purification, wastewater treatment, both analytical and preparative chromatography, and different types of chemical analyses as a technique for sample preconcentration and speciation of analytes. The predominant scientific basis for sorbent selection and design of an adsorption system is knowledge of the equilibrium partitioning between two phases, often expressed in the form of an adsorption isotherm. Based on the isotherms, the following important factors can be estimated: capacity of the sorbent, the method of sorbent regeneration, and the product purities [
A frequently applied method for determining single solute adsorption isotherms is the conventional batch method based on mixing known amounts of adsorbent with solutions of various initial concentrations (
The most commonly used empirical adsorption isotherm models are the Langmuir and Freundlich isotherms [
However, ElKhaiary noticed that both dependent and independent variables used for construction of isotherm equations are affected by experimental errors and first used the method known as orthogonal distance regression (ODR) for the isotherm parameter estimation [
Having so many options open, the researcher has to decide which one to apply. The paper of ElKhaiary and Malash [
Valuable information can be obtained when laboratory experiments are simulated through extensive Monte Carlo calculations. This technique allows for both complete specification and absolute control of all relevant parameters, a condition that real experiments never approximate well. An advantage of Monte Carlo simulations is that they can be repeated thousands of times in a reasonable time and at very low cost.
This study was performed to answer the question of which modeling approach should be applied in a particular case. Several aspects of the problem were addressed. Do the isotherm equation type and the number of parameters make a difference? How do the properties of the analytical method used to determine the initial and equilibrium concentrations affect the parameter estimation procedure? What is the preferred method if one has some information about the measurement error structure? And what is the winning method when the only available information is an isotherm dataset that consists of 5–10 points, with no replication?
The Monte Carlo technique was used as a tool to test the differences between nonlinear and orthogonal distance regression methods. Tendencies within modeling approaches were revealed on a large number of generated datasets, allowing the precision and accuracy of parameter estimates to be determined by comparison with true parameter values. Five isotherm models in the presence of eleven noise precision models (NPMs; six homoscedastic levels and five heteroscedastic types) were analyzed by eight modeling approaches. Three levels of reality were distinguished: the theoretical level on the one side, when the noise structure is exactly known, and two experimental levels on the other side, one in the absence of data about the noise structure and the second when estimates of standard deviations can be obtained.
As a result of this investigation, a clear strategy for data reduction in the field of adsorption is presented.
Over the years, a wide variety of equilibrium isotherm models have been formulated. In general, an adsorption isotherm is the relationship between quantity of the component retained on a solid phase
From a mathematical point of view, isotherm equations can be grouped into rational, power, and transcendental functions [
Adsorption isotherm models.
No. | Type | Type of function | Nonlinear form | Linear form | True parameters | Reference
1 | Langmuir | Rational |  |  | / | [
2 | Freundlich | Power |  |  | / | [
3 | Jovanovic | exp* |  | / | / | [
4 | Redlich-Peterson | Rational |  | / |  | [
5 | Sips | Rational |  | / |  | [
These models were chosen because they are widely used and because they represent different types of mathematical functions (the Langmuir, Redlich-Peterson, and Sips isotherms are rational functions, the Freundlich isotherm is a power function, and the Jovanovic isotherm is a transcendental function) and different numbers of parameters (Langmuir, Freundlich, and Jovanovic are two-parameter isotherms, and Redlich-Peterson and Sips are three-parameter isotherms). To avoid unnecessary repetition, detailed characteristics of the isotherms are not presented. Additional information can be found in the literature.
Let independent data pairs
Assume the smooth function
Closeness averaged over the entire data set is often measured by the sum of the squares of the individual distances. Any point
In the method of OLS the observations are assumed to be homoscedastic and all of the points are assigned equal weights
Ideally, observation weights should be estimated according to individual estimates of measurement error such that
When individual error estimates are unavailable, other empirical weights may provide a simple approximation of standard deviation. For the peculiar case of heteroscedasticity important in many analytical methods, relative standard deviations are reasonably constant over a considerable dynamic range. Thus,
However, the error structure in real data usually lies somewhere on a continuum between a constant absolute error (homoscedastic) at one extreme and a constant percentage error at the other. Between these two there is an error for which the standard deviation is proportional to the square root of the expected value:
Types of weights.
Type of weights | Expression
Absolute weights | 1
Poisson weights |
Assumption of constant percentage error |
Instrumental weights |
ISOFIT, a software package for fitting sorption isotherms to experimental data by weighted least squares, supports three alternatives: uniform weighting, sorbed relative where weights are inversely proportional to sorbed concentrations, and solute relative where weights are inversely proportional to measured solute concentrations [
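The four weighting schemes discussed above can be illustrated with a short sketch. It assumes that each weight multiplies the squared residual, so that a weight is the reciprocal of the assumed error variance; the function names `make_weights` and `weighted_sse` and the scheme labels are hypothetical and are not taken from the original work or from ISOFIT.

```python
def make_weights(y, scheme, sigma=None):
    """Return least-squares weights w_i that multiply the squared residuals.

    Assumption: w_i = 1 / (assumed variance of observation i).
    """
    if scheme == "absolute":       # homoscedastic noise: all weights equal to 1
        return [1.0 for _ in y]
    if scheme == "poisson":        # standard deviation proportional to sqrt(y)
        return [1.0 / yi for yi in y]
    if scheme == "relative":       # constant percentage error: sd proportional to y
        return [1.0 / yi ** 2 for yi in y]
    if scheme == "instrumental":   # individual estimates of the standard deviation
        if sigma is None:
            raise ValueError("instrumental weights need sigma estimates")
        return [1.0 / si ** 2 for si in sigma]
    raise ValueError("unknown weighting scheme: " + scheme)


def weighted_sse(y_obs, y_fit, w):
    """Weighted sum of squared vertical residuals."""
    return sum(wi * (yo - yf) ** 2 for wi, yo, yf in zip(w, y_obs, y_fit))
```

With `scheme="absolute"` the objective reduces to ordinary least squares; the other schemes progressively down-weight observations with larger assumed variance.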
In a more general situation, considerable errors can occur in both variables. It is stated that if the errors in
Let the considerable error be also present in the measurements of the independent variable
Again, the model will not fit the observed data points
The values
A reasonable way to estimate the unknown parameters in this case is to minimize the weighted sum of squares of all errors by minimizing the functional:
on the set
Commonly, (
This approach is known as errors in variables or orthogonal distance regression or total least squares. Condition
In orthogonal distance regression analysis of sorption data, units of the variables on the axes are not the same. It is necessary to introduce weights as constants selected to scale each type of variable
In Figure
Geometric illustration of differences among different regression methods: (a) OLS without weighting, (b) orthogonal distance regression without weighting, and (c) orthogonal distance regression with weighting.
Geometrically, if the data pairs
In this case, the radii of these circles are equal to distances between the points
Geometrical representation of a case when
Orthogonal distance regression methods have been used in the fields of science such as economy [
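The geometric idea behind weighted orthogonal distance regression can be sketched for a single observation: the point on the model curve that minimizes the weighted sum of squared distances on both axes is located numerically. The sketch below assumes a Langmuir curve and uses a golden-section search, which is adequate only when the distance function has a single minimum on the search interval; all function names, the default search interval, and the parameter values are illustrative assumptions rather than the procedure used in this study.

```python
import math


def langmuir(c, q_max, k):
    """Langmuir isotherm in its common form q = q_max*k*c / (1 + k*c)."""
    return q_max * k * c / (1.0 + k * c)


def weighted_sq_distance(x, point, params, wx, wy):
    """Weighted squared distance between `point` and the curve point at abscissa x."""
    xi, yi = point
    return wx * (xi - x) ** 2 + wy * (yi - langmuir(x, *params)) ** 2


def closest_point(point, params, wx=1.0, wy=1.0, lo=0.0, hi=None, tol=1e-10):
    """Golden-section search for the curve point closest to `point`.

    Assumption: the distance is unimodal on [lo, hi].
    """
    if hi is None:
        hi = 2.0 * point[0] + 1.0   # hypothetical default search interval
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        fc = weighted_sq_distance(c, point, params, wx, wy)
        fd = weighted_sq_distance(d, point, params, wx, wy)
        if fc < fd:
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    x = 0.5 * (a + b)
    return x, langmuir(x, *params)
```

Summing `weighted_sq_distance` at the closest points over all observations gives the quantity that an orthogonal distance regression minimizes with respect to the isotherm parameters; with a very large `wx` the closest point approaches the vertical projection used by ordinary least squares.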
Numerical experiments were designed to represent a typical experimental setup in adsorption studies as closely as possible. It was assumed that batch experiments are performed in laboratory beakers containing mass of sorbent
The true equilibrium concentration is then calculated solving the equation:
It is assumed that a simple univariate chemical measurement system with additive, zero-mean, white Gaussian measurement noise is used as an analytical tool to determine
The rest of the procedure was the same as if the experiments had been performed in a laboratory. The equilibrium sorbent loading was calculated from (
A flow chart of the laboratory experiment and the matching numerical experiment is presented in Figure
Flow chart illustrating the steps in adsorption equilibrium experiment and a matching numerical experiment.
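One numerical experiment of the kind described above can be sketched as follows. The sketch assumes the conventional batch mass balance q = (C0 - Ce)V/m, a Langmuir isotherm as the true model, and a user-supplied function `sd` that returns the noise standard deviation at a given concentration; the function names and all numerical values in the usage note are hypothetical.

```python
import random


def langmuir(c, q_max, k):
    """Langmuir isotherm in its common form q = q_max*k*c / (1 + k*c)."""
    return q_max * k * c / (1.0 + k * c)


def true_equilibrium_conc(c0, m, v, q_max, k):
    """Solve (c0 - ce)*v/m = langmuir(ce) for ce by bisection on [0, c0]."""
    lo, hi = 0.0, c0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # The mass-balance residual decreases monotonically with ce.
        if (c0 - mid) * v / m - langmuir(mid, q_max, k) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def numerical_experiment(c0_true, m, v, q_max, k, sd, rng):
    """Return simulated (ce_measured, q_measured) pairs for one isotherm dataset."""
    data = []
    for c0 in c0_true:
        ce = true_equilibrium_conc(c0, m, v, q_max, k)
        # Additive zero-mean Gaussian noise on both measured concentrations.
        c0_meas = c0 + rng.gauss(0.0, sd(c0))
        ce_meas = ce + rng.gauss(0.0, sd(ce))
        # The loading is computed from the noisy concentrations, as in the laboratory.
        q_meas = (c0_meas - ce_meas) * v / m
        data.append((ce_meas, q_meas))
    return data
```

For example, `numerical_experiment([10.0, 50.0, 100.0], 0.1, 0.05, 10.0, 0.5, lambda c: 0.05 * c, random.Random(1))` would simulate three points under a 5% relative standard deviation. Because the loading is computed from both noisy concentrations, the errors on the two axes are correlated, which is one reason errors-in-variables methods are of interest here.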
Since a wide variety of substances (toxic metals, organic pollutants, etc.) are the focus of the adsorption research community, a correspondingly wide variety of analytical techniques are used for the determination of initial and equilibrium concentrations. Accordingly, the measurement errors they introduce differ in type and magnitude. Different mathematical models, named noise precision models (NPMs), have been proposed to describe the change of analytical precision as a function of the analyte concentration. A list of such models for specific analytical methods, together with explanations of error sources, can be found in the literature [
Noise precision models.
No. | NPM | Expression | Type
1 | H1 |  | H*
2 | H2 |  | H
3 | H3 |  | H
4 | H4 |  | H
5 | H5 |  | H
6 | H6 |  | H
7 | Lin |  | Het**
8 | Quad |  | Het
9 | HS |  | Het
10 | RSD5% |  | Het
11 | RSD10% |  | Het
Although there are some experiments where it is reasonable to assume that one variable (
Since this study is based on simulated data, population standard deviations of measurement errors are known and (
for the weights on the
For the
where
Final formulation of the TODR error function which is minimized in case of theoretical fitting of adsorption data is presented as
Although TODR cannot be used outside the theoretical domain, it was included in this study to serve as a gold standard. It is expected to represent the best possible results that can be achieved with the observations at hand.
A typical isotherm data set in the experimental domain consists of 5–10 points. Very often, researchers perform their experiments in triplicate [
Data obtained in only one numerical experiment (matching the case when the laboratory experiments are performed with no replication) were fitted by the use of four error functions: OLS, ODR, MPSD, and HYBRD. Their expressions are presented in Table
Definitions of error functions in cases when there are no replicated concentration measurements.
Name | Abbreviation | Domain | Expression
Theoretical orthogonal distance regression | TODR | Theoretical |
Ordinary least squares | OLS | Experimental |
Orthogonal distance regression | ODR | Experimental |
Marquardt’s percent standard deviation | MPSD | Experimental |
Hybrid fractional error function | HYBRD | Experimental |
Formulae of the error functions in Table
OLS, MPSD, and HYBRD are basically least squares methods with different types of weights included. OLS is the approach with all weights equal to one. In the case of MPSD, the assumption of constant percentage error is accepted and weighting by the equilibrium loading is applied. For the HYBRD error function, the weights are of the Poisson type. The abbreviation ODR is used in this context for the orthogonal distance regression analog of MPSD. The assumption of constant percentage error is accepted for both axes, and the weights are
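The three least-squares-type error functions can be sketched as follows. The forms used here are those commonly quoted in the adsorption literature for MPSD and HYBRD and are an assumption: they may differ from the definitions used in this study by constant factors, which do not change the location of the minimum. The function and argument names are illustrative.

```python
import math


def ols(q_obs, q_fit):
    """Ordinary least squares: unweighted sum of squared residuals."""
    return sum((qo - qf) ** 2 for qo, qf in zip(q_obs, q_fit))


def mpsd(q_obs, q_fit, n_params):
    """Marquardt's percent standard deviation (commonly quoted form).

    Squared residuals are weighted by 1/q_obs**2 (constant percentage error).
    """
    n = len(q_obs)
    s = sum(((qo - qf) / qo) ** 2 for qo, qf in zip(q_obs, q_fit))
    return 100.0 * math.sqrt(s / (n - n_params))


def hybrd(q_obs, q_fit, n_params):
    """Hybrid fractional error function (commonly quoted form).

    Squared residuals are weighted by 1/q_obs (Poisson-type weights).
    """
    n = len(q_obs)
    s = sum((qo - qf) ** 2 / qo for qo, qf in zip(q_obs, q_fit))
    return 100.0 * s / (n - n_params)
```

Each function takes the observed loadings and the loadings predicted by the isotherm for a trial parameter vector, so any of them can be passed to a general-purpose minimizer as the objective.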
The second group of calculations matched the case when laboratory experiments are performed in triplicate. Means of equilibrium sorbent loading (
Definitions of error functions for replicated measurements.
Name | Abbreviation | Domain | Expression
Experimental weighted orthogonal distance regression | E3WODR | Experimental |
Weighted least squares | WLS | Experimental |
Triplicate orthogonal distance regression | E3ODR | Experimental |
It is important to say that E3WODR is the experimental realization of TODR. Estimates of standard deviation of the variables on the
Weighting in the E3ODR method is based on the mean values of equilibrium sorbent loading and equilibrium sorbate concentration:
For the WLS method, instrumental weights are calculated based on (
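How triplicate data can feed a weighted least squares objective is sketched below. The sketch assumes that the instrumental weight of each point is the reciprocal of the sample variance of its three replicate loadings; the function names are hypothetical, and a point whose replicates happen to coincide exactly would need separate handling because its sample variance is zero.

```python
import statistics


def triplicate_summary(replicates):
    """Return (means, weights) for a list of replicate tuples of sorbent loadings.

    Assumption: weights are 1 / s**2, with s the sample standard deviation.
    """
    means = [statistics.mean(r) for r in replicates]
    weights = [1.0 / statistics.variance(r) for r in replicates]
    return means, weights


def wls(means, q_fit, weights):
    """Weighted sum of squared residuals between replicate means and model values."""
    return sum(w * (m - f) ** 2 for w, m, f in zip(weights, means, q_fit))
```

Unlike simple averaging followed by OLS, this objective retains the information about the scatter of each point that the replicates provide.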
The present work was carried out using a Windows-based PC with a hardware configuration containing the dual processor AMD Athlon M320 (2.1 GHz each) and 3 GB RAM. All calculations were performed using Matlab R2007b. Perturbations were generated using the Mersenne Twister random number generator. For the purpose of fitting, the built-in Matlab function fminsearch was used for OLS, MPSD, HYBRD, and WLS [
It is important to note that one complete numerical experiment and all associated computations were performed for each simulation step, for 2000 steps per combination (one type of isotherm and one type of NPM). The choice of 2000 steps was a compromise between obtaining histograms of sufficient quality to facilitate quantitative comparison with theory and keeping the computation time acceptable. Few of the simulations took longer than 12 hours, with most averaging around 8 hours. Simulations with higher values of noise standard deviation generally lasted longer, because more function evaluations and iterations were needed before convergence was achieved.
This study included 55 numerical simulations on the whole (each of the five isotherms presented in Table
where
Normal probability plots were used to graphically assess whether the obtained parameter estimates could come from a normal distribution. Inspection of such plots showed that in general they are not linear. Distributions other than normal introduce curvature, so it was concluded that a non-normal distribution is involved. One representative example is presented in Figure
Normal probability plot for the parameters of Redlich-Peterson isotherm and OLS processing.
Thus, median of percentage error (mE) was used as a measure of accuracy of the method based on a particular error function, and mean absolute relative error (MARE),
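The two performance measures can be sketched for a single parameter as follows. The sketch assumes that the percentage error of an estimate is 100 × (estimate − true value)/true value and that MARE is the mean of the absolute percentage errors over all converged fits; the function names are illustrative.

```python
import statistics


def percentage_errors(estimates, true_value):
    """Signed percentage error of each estimate relative to the true value."""
    return [100.0 * (e - true_value) / true_value for e in estimates]


def median_percentage_error(estimates, true_value):
    """mE: a measure of accuracy that is robust to skewed distributions."""
    return statistics.median(percentage_errors(estimates, true_value))


def mean_absolute_relative_error(estimates, true_value):
    """MARE in percent: a measure of the variability of the estimates."""
    errors = percentage_errors(estimates, true_value)
    return statistics.mean([abs(e) for e in errors])
```

The median is used instead of the mean for the signed error because the distributions of the parameter estimates are not normal, so a few extreme fits would otherwise dominate the accuracy measure.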
Comparison of the methods was done separately for the two following groups of data: observations with no replications and data from triplicate experiments.
Properties of different modeling approaches in case when the experiments are performed once are presented in Figures
Properties of different modeling approaches for determination of parameters in Langmuir isotherm (experiments performed once): (a) mE and (b) MARE.
Properties of different modeling approaches for determination of parameters in Freundlich isotherm (experiments performed once): (a) mE and (b) MARE.
Properties of different modeling approaches for determination of parameters in Jovanovic isotherm (experiments performed once): (a) mE and (b) MARE.
Properties of different modeling approaches for determination of parameters in Redlich-Peterson isotherm (experiments performed once): (a) mE and (b) MARE.
Properties of different modeling approaches for determination of parameters in Sips isotherm (experiments performed once): (a) mE and (b) MARE.
Properties of different modeling approaches in case when the experiments are performed three times are presented in Figures A.1–A.5 as in Supplementary Material available online at
Due to a huge quantity of results obtained in this study, some rules had to be put on what is going to be presented in figures. For every type of isotherm, the figures are organized to have two sections: one where mE values are presented (figures labeled (a)) and the other where MARE values are presented (figures labeled (b)). Each section of the plot contains 7 subplots. In one subplot, the results of the applied methods for one NPM are summarized. Trends were noticed and discussed based on the six levels of homoscedastic noise, but in order to make figures compact just two out of six NPMs (H2 as an example of low noise, and H5 as an example of high noise) were presented as the first two subplots. The next five (3–7) subplots were reserved for heteroscedastic NPMs. An additional remark is valid for all the figures in the following paragraph: it was not possible to use the same scale in all subplots, due to large differences in the magnitudes of the outcomes from subplot to subplot. Nevertheless, it does not introduce any problem because the comparison of methods is done in frames of a subplot, and cross comparisons between different NPMs (and subplots) are not of substantial importance.
As expected, for the very low level of homoscedastic noise (H1 and H2 noise precision models), all of the examined methods performed well. As the noise standard deviation increased, the accuracy and precision of the methods became worse, and differences between methods started to appear.
Generalizing the results for all five isotherms, the following statements can be made. Regardless of the mathematical type of the isotherm equation (rational, power, or exponential) and the number of parameters (two or three), the OLS method had the best properties. In the group of methods applicable in practice, it achieved mE values closest to zero and the lowest values of MARE, almost identical to the ones determined by the theoretical method TODR. The mE of the other tested methods showed a higher discrepancy from zero, and their MARE values were higher. The ODR and MPSD methods performed very poorly, while the results of the HYBRD method were somewhere in between.
Closeness of the results of OLS and TODR methods showed that in case of homoscedastic noise, the presence of measurement error on both axes is not of great importance, as could be expected. What is more, the weighting by
For the two-parameter isotherms (Langmuir and Freundlich), linearized models were tested due to their popularity. Modeling of the linearized Langmuir equation was the only exception to the rule that the greater the population standard deviation of the noise, the greater the discrepancies of parameter estimates from their true values. Regardless of the level of noise, the LIN method gave equally bad results. The mE was about −65% and MARE was in the range 60–85% for both of the parameters. For the Freundlich isotherm, the LIN model was more accurate than HYBRD, MPSD, and ODR and had about the same variability as MPSD, still remaining in the group of modeling approaches whose usage is not advised in the case of homoscedastic data.
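The linearization approach can be sketched for the Langmuir isotherm. Several linearized forms of the Langmuir equation exist; the sketch assumes the form Ce/qe = 1/(q_max K) + Ce/q_max fitted by an unweighted straight-line regression, which may not be the form tested in this study. The function names are hypothetical.

```python
def linear_fit(x, y):
    """Unweighted straight-line least squares; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def langmuir_linearized(ce, qe):
    """Estimate (q_max, k) from the assumed linear form Ce/qe versus Ce."""
    slope, intercept = linear_fit(ce, [c / q for c, q in zip(ce, qe)])
    q_max = 1.0 / slope      # slope = 1 / q_max
    k = slope / intercept    # intercept = 1 / (q_max * k)
    return q_max, k
```

On noise-free data this transformation recovers the true parameters exactly, but the transformed variable Ce/qe combines both measured quantities, so the noise structure of the transformed data is distorted and the equal-weight assumption of the straight-line fit no longer holds.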
Looking at the subplots 3–7 of Figures
The ODR method was generally as accurate as TODR. The greatest deviation of mE from zero was −3.2% for the RSD10% noise type and
When the equilibrium adsorption experiments are performed once and estimates of the standard deviation of the measurement error are not available, weighting is restricted to being fixed or to being some function of the measured variable. When the experiments are done in triplicate, this restriction is lifted, since estimates of the standard deviations can be obtained. The adsorption literature, surprisingly, rarely takes into account these important statistical details related to the processing of data in regression analysis with replicate measurements. Commonly, but not properly, data from triplicate measurements are simply averaged, and their mean values are then processed by OLS. Weighted regression preserves the information contained in the replicates and thus should be preferred.
In Figures A.1–A.5 (in Supplementary Material), the results of modeling the Langmuir, Freundlich, Jovanovic, Redlich-Peterson, and Sips isotherms are presented. At first glance, it can be noticed that the accuracy and precision of the methods are better than in the case of one experiment per point.
The E3WODR method had the best properties, WLS performed slightly worse, and E3ODR was ranked third. The only exception was the Freundlich equation, where MARE values tended to be lower for the E3ODR method in the case of heteroscedastic noise. However, the difference between E3WODR and E3ODR was less than 1%, and thus this particular behavior is not of great importance.
The accuracy of model parameters will depend on whether the appropriate conceptual model was chosen, whether the experimental conditions were representative of environmental conditions, and whether an appropriate parameter estimation method was used.
Recently our group faced the problem of modeling the adsorption isotherms [
It was demonstrated that trends that could be noticed do not show dependence on isotherm type. Only the magnitude of percent errors in parameter estimates classifies some of the equation types and their particular parameters as difficult to fit (
The accuracy and variability of orthogonal distance regression-based methods (ODR for experiments performed once and E3ODR and E3WODR for experiments performed in triplicate) are closely followed by the analog methods that do not take into account the influence of measurement error on both axes: MPSD and WLS.
Linearization of isotherm equations was once again discarded in this study. Since the survey of the literature published in the last decade showed that in over 95% of the liquid phase adsorption systems the linearization is the preferred method [
Further research that is currently in progress in our group will hopefully resolve the issue of adequate model selection in adsorption studies.
Redlich-Peterson isotherm constant
True equilibrium concentration
Experimentally determined equilibrium concentration
Mean of three experimentally determined equilibrium concentrations
True adsorbate initial concentration
Experimentally determined adsorbate initial concentration
Weights on the
Percentage error
The exponent in Redlich-Peterson isotherm
Indices for running number of data points
Running number of numerical experiments
Coefficient equal to
Number of converged fits in a simulation
Parameter ordinal number in the isotherm equation
Freundlich adsorption constant related to adsorption capacity
The exponent in Jovanovic isotherm
The equilibrium adsorption constant in Langmuir equation
The Redlich-Peterson isotherm parameter
The Sips isotherm parameter
The weight of adsorbent
Median percentage error
The Sips model exponent
Mean absolute relative error
Number of observations
Adsorption intensity in Freundlich isotherm
Number of parameters
Experimentally determined adsorbent equilibrium loading
Mean of three experimentally determined adsorbent equilibrium loadings
Equilibrium adsorbent loading calculated by isotherm equation
The Langmuir maximum adsorption capacity
The Sips maximum adsorption capacity
The Jovanovic maximum adsorption capacity
Estimates of population standard deviation for independent variable
Estimates of population standard deviation for dependent variable
The volume of adsorbate solution
Weights on the
Data point on the fitted curve that is the closest to the observation
Measurement error in dependent variable
Measurement error in independent variable
Population standard deviation of measurement error in dependent variable
Population standard deviation of measurement error in independent variable
Population standard deviation of measurement error in initial concentration
Population standard deviation of measurement error in equilibrium concentration
Vector of true parameters
Vector of estimated parameters
Vector of distances in
Vector of distances in
Vertical distance between observation and model function.
The authors declare that there is no conflict of interests regarding the publication of this paper.
This study was supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia (Project no. III 43009).