A systematic theoretical basis is developed that optimizes an arbitrary number of variables for (i) modeling data and (ii) determining stationary points of a function of several variables by the optimization of an auxiliary function of a single variable, deemed the most significant on physical, experimental, or mathematical grounds, from which all the other optimized variables may be derived. Algorithms that focus on a reduced variable set avoid problems associated with multiple minima and maxima that arise because of the large number of parameters. For (i), both approximate and exact methods are presented, where the single controlling variable k, on which all the other variables P(k) depend, passes through the local stationary point of the least squares metric. For (ii), an exact theory is developed whereby the solution obtained from an independent variation of all parameters of the optimized function coincides with that due to the single-parameter optimization of an auxiliary function. The implicit function theorem has to be further qualified to arrive at this result. A nontrivial real-world application of the above implicit methodology to rate constant and final concentration parameter determination is made to illustrate its utility. This work is more general than the reduction schemes for conditional linear parameters since it covers the nonconditional case as well and has potentially wide applicability.
1. Introduction
The following theory is a systematic development covering properties of constrained and unconstrained functions that are continuous and differentiable to various specified degrees [1, 2] and the proof of the existence of implicit functions [3] for the forms of the functions to be optimized. The implicit function theorem is applied in a manner that requires further qualification because the optimization problem is of an unconstrained kind without any redundant variables. Methods (i)a,b (described in Sections 2 and 3, resp.) refer to modeling of data [4, Chapter 15, pages 773–806] where the form of the function QMD(P,k) with independently varying variables (P,k) is
(1)QMD(P,k)=∑i=1N′(yi-f(P,ti,k))2,
where yi and ti are datapoints and f is a known function, and optimizations of QMD may be termed a least squares (LS) fit over parameters (P,k) which are independently optimized for N′ datasets. Method (ii) focuses on optimizing a general QE(P,k) function, not necessarily LS in form. There are many standard and hybrid methods to deal with such optimization [4, Chapter 10], such as golden section searches in 1D, simplex methods over multidimensions [4, pages 499–525], steepest descent and conjugate methods [5], and variable metric methods in multidimensions [4, pages 521–525]. Hybrid methods include multidimensional (DFP) secant methods [6], BFGS secant optimization [7], and RFO rational function optimization [8], which is a Newton-Raphson technique utilizing a rational function rather than a quadratic model for the function close to the solution point. Global deterministic optimization schemes combine several of the above approaches [9, Section 6.7.6]. Other physical methods, perhaps less easy to justify analytically, include probabilistic “basin-hopping” algorithms [9, Section 6.7.4], simulated annealing methods [10], and genetic algorithms [9, page 346]. An analytical justification, on the other hand, is attempted here for these deterministic methods, but in real-world applications some of the assumptions (e.g., C2 continuity, compactness of spaces) may not always hold. For what follows, the distance metrics used are all Euclidean, represented by |·| or ∥·∥, where det ∥·∥ represents the determinant of the matrix ∥·∥. Reduction of the number of variables to be optimized is possible in the standard matrix regression model only if conditional linear parameters β exist [11], where these β variables do not appear in the final S(θ) expression of the least squares function (2) to be optimized, whereas the ϕ nonconditional linear parameters do and are a subset of the θ variables; for the existence of each conditional linear parameter, there is a unit reduction in the number of independent parameters to be optimized. These reductions in variable number occur for any “expectation function” f(x,θ), which is the model or law for which a fitting is required, where there are N different datapoints xi, i=1,2,…,N, that must be used to determine the p parameter variables θ [11, page 32, Chapter 2]. A conditionally linear parameter θi exists if and only if the derivative of the expectation function f(x,θ) with respect to θi is independent of θi. Clearly such a condition may severely limit the number of parameters that can be eliminated from the expectation function variables when the prescribed matrix regressional techniques are employed [11, Section 3.5.5, page 85] where the residual sum of squares is minimized:
(2)S(θ)=∥y-η(θ)∥2.
The N-vectors η(θ) in P-dimensional space define the expectation surface. If the θ variables are partitioned into the conditional linear parameters β and the other nonlinear parameters ϕ, then the response can be written η(β,ϕ)=A(ϕ)β. Golub and Pereyra [12] used a standard Gauss-Newton algorithm to minimize S2(ϕ)=∥y-A(ϕ)β^(ϕ)∥2, which depends only on the nonlinear parameters ϕ, where β^(ϕ)=A+(ϕ)y, with A+ being a defined pseudoinverse of A [11, Section 3.5.5, page 85] and A+ and A matrices. The variables must be separable as discussed above, and the reduction in the number of variables is only equal to the number of conditional linear parameters that exist for the problem. In applications, the preferred algorithm that exploits this valuable variable reduction is called variable projection. There are many applications in time-resolved spectroscopy that are heavily dependent on this technique, and many references to the method are given in the review by van Stokkum et al. [13]. Recently this method of variable projection has been extended in a restricted sense [14] in the field of inverse problems, which is not related to our method of either modeling or optimization, nor is that methodology related to the implicit function properties. In short, many of the reported methods are ad hoc, meaning that they are constructed to face the specific problems at hand with no claim to overall generality, and this work too is ad hoc in the sense of suggesting variable reduction for specific classes of noninverse problems as indicated, where the work develops a method of reducing the variable number to unity for all variables in the expectation function space, irrespective of whether they are conditional or not, by approximating their values by a method of averages (for method (i)a) without any form of linear regression being used in determining their approximations during the minimization iterations and without necessarily using the standard matrix theory that is valid for a very limited class of functions. Methods (i)b and (ii) are on the other hand exact treatments. No “elimination” of conditional linear parameters is involved in this nonlinear regression method, nor is any projection in the mathematical sense involved. These general methods could have useful applications in deterministic systems comprising many parameters that are all linked to one variable: the primary one (denoted k here) that is considered on physical grounds to be the most significant. A generalization of this method would be to select a smaller set of variables than the full parameter list instead of just one variable as illustrated here. Another tool that could be used in conjunction with the reduced variable method is to employ the various search algorithms that have been actively developed [15–19] together with the reduction scheme developed here. As far as we are aware, these optimization methods for multivariable problems, as detailed in more recent publications, all seem to focus mainly on various stochastic or deterministic methods using discrete algorithms in some type of search sequence over the multivariable domain space {P,k,t}. In other less ambitious works, the problem is narrowed to the specific nature of the system, where the object function is specified and amenable to precise treatment, for example [20, 21], with a well-defined domain space, without variable reduction. In another context, quite different from the current development, variable reduction has been applied to DEA problems [22].
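To make the variable projection idea concrete, the following minimal Python sketch eliminates the conditional linear parameters via the pseudoinverse and minimizes S2(ϕ) over a single nonlinear parameter ϕ. It is a toy two-term separable model with synthetic data; none of the numbers or names are taken from [12] or [13].

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Toy separable model: y ~ b1*exp(-phi*t) + b2, with b1, b2 conditional linear.
t = np.linspace(0.0, 5.0, 50)
rng = np.random.default_rng(0)
y = 2.0 * np.exp(-1.3 * t) + 0.5 + 0.01 * rng.standard_normal(t.size)

def A(phi):
    # Design matrix whose columns multiply the conditional linear parameters.
    return np.column_stack([np.exp(-phi * t), np.ones_like(t)])

def S2(phi):
    # beta_hat(phi) = A^+(phi) y eliminates the linear parameters exactly.
    beta_hat, *_ = np.linalg.lstsq(A(phi), y, rcond=None)
    r = y - A(phi) @ beta_hat
    return r @ r

res = minimize_scalar(S2, bounds=(0.1, 5.0), method="bounded")
phi_opt = res.x
beta_opt, *_ = np.linalg.lstsq(A(phi_opt), y, rcond=None)
print(phi_opt, beta_opt)   # expect phi ~ 1.3 and beta ~ (2.0, 0.5)
```

Note that this classical reduction removes only the two conditional linear parameters; the methods developed below are not restricted in this way.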
Other examples of multiparameter complex systems include those for multiple-step elementary reactions, each with its own rate constant, that give rise to photochemical spectral signals that must be resolved unambiguously [23], but these belong to the class of functions with conditional linear parameters. The work here, on the other hand, predominantly focuses on accurately determining the range of the function to be optimized by reducing the space of the domain. Hence, this method can be successfully combined with the usual domain searching techniques mentioned above to effectively locate stationary points by a two-pronged approach. All these complex and coupled processes in physical theories are related by postulated laws Ylaw(P,k,t) that feature parameters (P,k). Other examples include quantum chemical calculations with many topological and orientation variables that need to be optimized with respect to the energy, but in relation to one or a few variables, such as the molecular trajectory parameter during a chemical reaction, where this variable is of primary significance in deciding on the “reasonableness” of the analysis [9, Section 6.2.3, page 294]. Methods (i)a and (i)b below refer to LS data-fitting algorithms. Method (i)a is an approximate method; it is proved that, under certain conditions, it could yield a more accurate determination of parameters compared to a standard LS fit using (1). Method (i)b develops a technique where the optimum value for QMD with domain values (P,k) coincides with that of the standard LS method where the (P,k) variables are varied independently. Also discussed is the relative accuracy of both method (i)a (in Section 2.2) and method (i)b (endnote at end of Section 3). Method (ii) develops a single parameter optimization where the conditions of an arbitrary QOPT(P,k) function are met simultaneously; namely,
(3)∂QOPT(k)∂k=0⟶{∂QOPT(P,k)∂P=0,∂QOPT(P,k)∂k=0}.
We note that methods (i)a, (i)b, and (ii) are not related to the Adomian decomposition method and its variants that expand polynomial coefficients [24] for solutions to differential equations not connected to estimation theory; indeed here there are no boundary values that determine the solution of the differential equations.
2. Method (i)a Theory
This approximate method utilizes the average of the Nc unique solutions (Nc is defined in Section 2.1) for each value of k, where the form of the fitting function—a “law” of nature, for instance—is specified. Deterministic laws of nature are conveniently written in the form
(4)Ylaw=Ylaw(P,k,t),
linking the variable Ylaw to t. The components Pi (i=1,2,…,Np) of P and k are parameters. Verification of a law of form (4) relies on an experimental dataset {(Yexp(ti),ti),i=1,2,…,N}. The t variable could be a vector of variable components of experimentally measured values or a single parameter, as in the kinetics examples below where ti denotes values of time t in the domain space. The vector form will be denoted by x. Variables (x) are defined as members of the “domain space” of the measurable system, and similarly Ylaw is the defined range or “response” space of the physical measurement. Confirmation or verification of the law is based on (a) deriving experimentally meaningful values for the parameters (P,k) and (b) showing a good enough degree of fit between the experimental set Yexp(ti) and Ylaw(ti). In real-world applications, to chemical kinetics for instance, several methods [25–28] have been devised to determine the optimal (P,k) parameters, but most if not all of these methods consider the aforementioned parameters as autonomous and independent (e.g., [26]). A similar scenario broadly holds for current state-of-the-art applications of structural elucidation via energy functions [9, Chapters 4, 6]. To preserve the viewpoint of the interrelationship between these parameters and the experimental data, we devise schemes that relate P to k for all Pi via the set {Yexp(ti),ti} and optimize the fit over k-space only. That is, a Pi(k) dependency on k is induced via the experimental set {Yexp(ti),ti}. The conditions that allow for this will also be stated for the different methods.
2.1. Details of Method (i)a
Let Np be the number of components of the P parameter, N′ the number of dataset pairs (Yexp(ti),ti), and Ns the number of datasets (Yexp,t) whose use leads to a singularity in the determination of P¯i(k) as defined below and which must therefore be excluded from that determination. Then (Np+1)≤(N′-Ns) for the unique determination of {P,k}. Let Nc be the total number of different dataset combinations that can be chosen which do not lead to singularities. If the singularities are not choice dependent, that is, if a particular dataset pair leads to singularities for all possible choices, then Nc=C(N′-Ns,Np), the total number of combinations of the datasets {Yexp(ti),ti} taken Np at a time that do not lead to singularities in Pi. In general, Nc is determined by the nature of the datasets and the way in which the proposed equations are to be solved. Write Ylaw in the form
(5)Ylaw(t,k)=f(P,t,k),
and for a particular dataset {Yexp(ti),ti}, write f(i)≡f(P,ti,k). Define the vector function fg with components fg(i)≡Yexp(ti)-f(i)=fg(i)(P,k). Assume fg∈C1 defined on an open set K0 that contains k0.
Lemma 1.
For any k0 such that det∥∂fg(i)(P,k0)/∂Pj∥≠0, there exists a unique function P(k)∈C1 (with components P1(k),…,PNp(k)) defined on K0 with P(k0)=P0, such that fg(P(k),k)=0 for every k∈K0.
Proof.
The above follows from the implicit function theorem (IFT) [3, Theorem 13.7, page 374] where k∈K0 is the independent variable for the existence of the P(k) function.
We seek the solutions for P(k) subject to the above conditions for our defined functions. Map f→Yth(P¯,t,k) as follows:
(6)Yth(t,k)=f(P-,t,k),
where the term P¯ and its components are defined below and where k is a varying parameter. For any of the (i1,i2,…,iNp) combinations denoted by a combination variable α≡(i1,i2,…,iNp) where ij≡(Yexp(tij),tij) is a particular dataset pair, it is in principle possible to solve for the components of P¯ in terms of k through the following simultaneous equations:
(7)Yexp(ti1)=f(P,ti1,k),Yexp(ti2)=f(P,ti2,k),⋮Yexp(tiNp)=f(P,tiNp,k),
from Lemma 1. Each α choice yields a unique solution Pi(k,α) (i=1,2,…,Np), where Pi(k,α)∈C1. Hence any function of Pi(k,α) involving addition and multiplication is also in C1. For each Pi, there will be Nc different solutions, Pi(k,1),Pi(k,2),…,Pi(k,Nc). We can define an arithmetic mean (there are several possible mean definitions that could be utilized) for the components of P¯ as
(8)Pi¯(k)=1Nc∑j=1NcPi(k,j).
In choosing an appropriate functional form for P¯ in (8), we assumed equal weighting for each of the dataset combinations; however, the choice is open, based on appropriate physical criteria. We verify below that the choice of P¯(k) satisfies the constrained variation of the LS method, so as to emphasize the connection between the level surfaces of the unconstrained LS problem and the line function P¯(k).
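For concreteness, a schematic Python sketch of the combination-and-average construction in (7)–(8) follows. The per-combination solver solve_P is a hypothetical user-supplied routine (analytic or numerical) that is assumed to return None for the singular combinations, and at least one nonsingular combination is assumed to exist.

```python
from itertools import combinations
import numpy as np

def P_bar(k, data, Np, solve_P):
    # data: list of (Y_exp(t_i), t_i) pairs; solve_P(subset, k) returns the
    # unique solution P(k, alpha) of the Np simultaneous equations (7) for a
    # dataset combination alpha, or None when alpha is singular (excluded).
    sols = []
    for alpha in combinations(range(len(data)), Np):
        P = solve_P([data[i] for i in alpha], k)
        if P is not None:
            sols.append(P)
    # Equal-weight arithmetic mean of the Nc nonsingular solutions, eq. (8).
    return np.mean(sols, axis=0)
```

Other weightings, as noted above, amount to replacing the plain mean by a weighted one on physical grounds.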
Each Pi(k,j) is a function of k whose derivative is known either analytically or by numerical differentiation. To derive an optimized set with the LS method, define
(9)Q(k)=∑i=1N′(Yexp(ti)-Yth(k,ti))2.
Then for an optimized k, we have Q′(k)=0. Defining
(10)R(k)=c·∑i=1N′(Yexp(ti)-Yth(k,ti))·Yth′(k,ti),
the optimized solution of k corresponds to R(k)=0 which has been reduced to a one-dimensional problem. The standard LS variation on the other hand states that the variables PT={P,k} in (5) are independently varied so that
(11)QT(PT)=∑i=1N′(Yexp(ti)-f(PT,ti))2,
with solutions for QT in terms of PT whenever ∂QT/∂PT=0. Of interest is the relationship between the single-variable variation in (9) and the total variation in (11). Since P¯ is a function of k, (9) is a constrained version of the variation in (11), where
(12)δQ(k)=δQ(P¯,k)=(∂Q∂P¯·δP¯+∂Q∂k),
subject to gi(P¯,k)=P¯i-hi(k)=0 (i.e., P¯i=hi(k) for some function hi of k), where the P¯i are the components of P¯. According to Lagrange multiplier theory [3, Theorem 13.12, page 381], the function f:Rn→R has an optimal value at x0 subject to the constraints g:Rn→Rm over the subset S where g=(g1,g2,…,gm) vanishes, that is, at x0∈X0 where X0={x:x∈S,g(x)=0}, when either of the following equivalent equations ((13), (14)) is satisfied:
(13)Drf(x0)+∑k=1mλkDrgk(x0)=0(r=1,2,…,n),(14)∇f(x0)+λ1∇g1(x0)+⋯+λm∇gm(x0)=0,
where det∥Djgi(x0)∥≠0 and the λ’s are fixed real numbers. We refer to P¯i as any variable that is a function of k constructed on physical or mathematical grounds, and not just the special case defined in (8). Write
(15)gi=P¯i-hi(k)=0(i=1,2,…,Np),
where Djgi(x0)=δij since Dj=∂/∂P¯j and therefore det∥Djgi(x0)∥≠0. We abbreviate the functions f(i)=f(P,ti,k) and f¯(i)=f(P¯,ti,k). Define
(16)fQ(x)≡Q(P¯,k,t)=∑i=1N′(Yexp(ti)-f¯(i))2,
where Yexp(ti) are the experimental subspace variables as in (7) with x∈X0 defined above. We next verify the relation between Q(k) and QT.
Verification. The solution Q′(k)=R(k)=0 of (10) is equivalent to the variation of fQ(x) defined in (16) subject to the constraints gi of (15).
Proof.
Define the Lagrangian to the problem as L=fQ(x)+∑i=1Npλigi. Then the equations that satisfy the stationary condition
(17)∂L∂P¯j=0,j=1,2,…,Np;∂L∂k=0
reduce to the (equivalent) simultaneous equations
(18)∑i=1N′(Yexp(ti)-f¯(i))∂f¯(i)∂P¯j=λj′,j=1,2,…,Np,(19)∑i=1N′(Yexp(ti)-f¯(i))∂f¯(i)∂k+∑j=1Npλj′∂P¯j∂k=0.
Substituting λj′ in (18) to (19) leads to
(20)∑i=1N′(Yexp(ti)-f¯(i))∂f¯(i)∂k+∑j=1Np∑i=1N′(Yexp(ti)-f¯(i))·∂f¯(i)∂P¯j·∂P¯j∂k=0.
Since dP¯i/dk=∂P¯i/∂k, (20) implies dQ(P¯,k,t)/dk=0 for the Q functions in (11), (12), and (16).
Of interest is the theoretical relationship between the (P¯,k) variables of the Q functions described by (9), (12), and (16), denoted Q1, and those of the freely varying Q function of (1), denoted Q2; these can be written as
(21)Q1=Q(P¯,t,k),(22)Q2=Q(P,t,k),
which is given by the following theorem, where we abbreviate αi=(Yexp(ti)-f(P,ti,k)) and α¯i=(Yexp(ti)-f(P¯,ti,k)), where we note that the f functional form is unique and is of the same form for both these α variables.
Theorem 2.
The unconstrained LS solution to Q2=Q(P,k,t) for the independent variables {P,k} is also a solution for the constrained single-variable variation in k′, where P=P¯(k′), k=k′. Further, the two solutions coincide if and only if
(23)∑i=1N′(Yexp(ti)-f(P¯,ti,k))∂f(P¯,ti,k)∂P¯j=0,j=1,2,…,Np.
Proof.
The Q2 unconstrained solution is derived from the equations
(24)∂Q2∂Pj=c·∑i=1N′αi∂f(i)∂Pj=0,j=1,2,…,Np,(25)∂Q2∂k=c·∑i=1N′αi∂f(i)∂k=0,
with c being constants. If there is a P¯(k) dependency, then we have
(26)dQ1(k)dk=c·∑j=1Np(∑i=1N′α¯i∂f¯(i)∂Pj)∂P¯j∂k+c·∑i=1N′α¯i∂f¯(i)∂k.
If the variable set {P,k} satisfies (24) and (25) in unconstrained variation, then the values, when substituted into (26), satisfy the equation dQ1(k)/dk=0, since f(i) and f¯(i) have the same functional form. This proves the first part of the theorem. The second part follows from the converse argument: from (26), if dQ1(k)/dk=0, then setting the first factor to zero as in (27) leads to the implication (28):
(27)∑i=1N′α¯i∂f¯(i)∂P¯j=0,j=1,2,…,Np,(28)⟹∑i=1N′α¯i∂f¯(i)∂k=0,
so the solution set {P¯(k′),k′} satisfies dQ1(k)/dk=0 together with the conditions of both (27) and (28). Then (27) satisfies (24) and (28) satisfies (25).
The theorem, verification, and lemma above do not indicate topologically under what conditions a coincidence of solutions for the constrained and unconstrained models exists. Figure 1 depicts the discussion below. From Theorem 2, if set A represents the solution {P,k} for the unconstrained LS method and set B={P¯,k′} for the constrained method, then B⊇A. Define k within the range k1≤k≤k2. Then k is in a compact space, and since P(k)∈C1, P(k) is uniformly continuous [2, Theorem 8, page 79]. Then admissible solutions to the above constraint problem with the inequality B⊇A imply Q(P(k))≥Qmin, where Qmin is the unconstrained minimum. The unconstrained Q=QT LS function to be minimized in (11) implies
(29)∇Q=0,(∂Q∂k=∂Q∂Pi=0,fori=1,2,…,Np).
Defining the constrained function Qc(k)=∑i=1N′(Yexp(ti)-f(P(k),ti,k))2, then Qc(k)=Q∘PT where PT=(P1(k),P2(k),…,PNp(k),k)T. Because Qc′(k)=(∂Q/∂P,∂Q/∂k)·(P1′(k),…,PNp′(k),1)T, solutions occur when (i) ∇Q=0, corresponding to the coincidence of the local minimum of the unconstrained Q with the best choice for the line with coordinates (P(k),k) as it passes through the local unconstrained minimum, and (ii) Pi′(k)=0 (i=1,2,…,Np), ∂Q/∂k=0, where this solution is a special case of (iii) when the vector PT′ is ⊥ to ∇Q≠0; that is, PT′ is at a tangent to the surface Q=S2 for some S2≥Qmin, the situation shown in Figure 1, where the vector is tangent at some point of the surface QT=S2. Whilst the above characterizes the topology of a solution only, the existence of a solution for the line (P(k),k) which passes through the point of the unconstrained minimum of QT is proven below under certain conditions, where a set of equations is constructed to allow for this significant application, specifying the conditions under which the standard LS constrained variation solution implies the same solution as the unconstrained variation. Also discussed is the case when it may be possible for the unconstrained solution set U to satisfy the inequality Qc(U)≥Qc(P¯), where Qc is a function designed to accommodate all solutions of (7), as given below in (30).
Figure 1: Depiction of how the k variation optimizing Q leads to a solution on a level surface of the QT function where QT≤Q.
2.2. Discussion of LS Fit for a Function Qc with a Possibility of a Smaller LS Deviation than for {P,k} Parameters Derived from a Free Variation of (11)
The LS function metric such as (11) implies Q(P,k)≤Q(P¯,k) at a stationary (minimum) point for variables (P,k). On the other hand, the sets of solutions of (7), denoted {}i and Nc in number, provide for each set exact solutions P(k), averaged to P¯(k) using (8). If the {Pi}, i=1,2,…,Nc, solutions are in a δ-neighbourhood, then we examine the possibility that the composite function metric to be optimized over all the Nc sets of equations {}i, defined here as
(30)Qc(P,k)=∑i∈{}1,{}2⋯{}Nc(Yexp-f(i))2,
could be such that
(31)Qc(PT,k)≥Qc(P¯,k),
where (PT,k) is the unconstrained optimized value of (22). This implies that, under these conditions, the Qc of (30) is a better measure of fit. This will be proven to be the case under certain conditions below. For what follows, the solution P(k) for equation set {}i obtains for all k values of the open set S, k∈S, from the IFT, including the k0 which minimizes (9). Another possibility that will be discussed briefly later is where, in (30), all {P,k} are free to vary. Here we consider the case of the Nc P values averaged to P¯ for some k. We recall the intermediate-value theorem (IVT) [3, Theorem 4.38, page 87] for real continuous functions f defined over a connected domain S which is a subset of some Rn. We assume that the f functions immediately below obey the IVT. For each {Pi} solution of the {}i set for a specific k=k0, we assume that the function Qc,i(P) is a strictly increasing function in the sense of Definition 3 below, where
(32)Qc,i(P)=∑j∈{}i(Yexp(j)-f(j))2,
with Qc,i(Pi)=0 at k=k0.
Definition 3.
A real function f is (strictly) increasing in a connected domain S⊆RN about the origin at r0 if, relative to this origin, |r2|>|r1| (for the boundaries ∂B(r0,r1) and ∂B(r0,r2) of the corresponding balls) implies both maxf(∂B(r0,r2))(>)≥maxf(∂B(r0,r1)) and minf(∂B(r0,r2))(>)≥minf(∂B(r0,r1)).
Note. A similar definition is obtained for a (strictly) decreasing function with the (<)≤ inequalities. Since the ∂B boundaries are compact and f is continuous, the maximum and minimum values are attained for all ball boundaries. We assume f to be strictly increasing relative to r0 for what follows below.
Lemma 4.
For any point at radius r from r0 in the region bounded by ∂B(r0,r1) and ∂B(r0,r2) (i.e., r1<r<r2, centered about coordinate r0),
(33)minf(∂B(r0,r1))<f(r)<maxf(∂B(r0,r2)).
Proof.
Suppose in fact f(r)<minf(∂B(r0,r1)); then
(34)minf(∂B(r0,r))<minf(∂B(r0,r1))(r>r1)
which is a contradiction to the definition and a similar proof is obtained for the upper bound.
Note. Similar conditions apply for the nonstrict inequalities ≤,≥.
The function that is optimized is
(35)Qc(P)=∑i=1NcQc,i(P).
Define Pi as the solution vector for the equation set {}i. We illustrate the conditions under which the solution PT of a free variation of the Q metric given in (11) can fulfill the following inequality, where Qc is as defined in (35):
(36)Qc(PT)≥Qc(P¯)
with P¯ given as in (8). A preliminary result is required. Define δ=max∥Pi-Pj∥ over all i,j=1,2,…,Nc, and δPi=P¯-Pi.
Lemma 5.
∥δPi∥≤δ.
Proof.
(37)∥P¯-Pi∥=1Nc∥∑qPq-NcPi∥≤1Nc∑j=1Nc∥Pj-Pi∥≤δ.
Lemma 6.
Qc(PT,k)>Qc(P¯,k) for δ<∥PT-Pi∥<δT for some δT.
Proof.
Any point P¯ would be located within a spherical annulus centered at Pi, with radii chosen so that by Lemma 4, we have the following results:
(38)ϵmax,i>Qc,i(P¯)>ϵmin,i,
where f=Qc,i in (32). Choose δi so that δi<∥δPi∥<δ. Define Ann(δ,δi,Pi) as the space bounded by the boundaries of the balls centered on Pi of radius δ and δi (δ>δi). Then P¯∈Ann(δ,δi,Pi) by Lemma 5. Since Qc in (30) is not equivalent to QT=Q in (11) (we write here the free-variation vector solution as PT), the above results lead to the following:
(39)∑i=1Ncϵmin,i<Qc(P¯)=∑i=1NcQc,i(P¯)<∑i=1Ncϵmax,i,(40)ϵmax,i<Qc,i(PT)<ϵT,i,
where (40) follows from (33). Summing (40) over i and comparing with (39) leads to Qc(PT,k)>Qc(P¯,k).
Hence we have demonstrated that it may be more realistic or accurate to fit parameters based on a function that represents different coupling sets, such as Qc in (30), rather than the standard LS method using (11), if PT lies sufficiently far away from P¯. We note that if PT is the solution of the free variation of the above Qc in (30), then from the arguments presented after the proof of Theorem 2, it follows that
(41)Qc(PT)≤Qc(P¯),
which implies that, assuming equal weighting of experimental measurements, the independent variation of all parameters in the LS optimization of the Qc function is the more accurate procedure compared to the standard free variation of parameters using the Q function of (11).
3. Method (i)b Theory
Whilst it is advantageous in science data analysis to optimize a particular multiparameter function by focusing on a few key variables (our k variable of restricted dimensionality, which we have applied to a 1-dimensional optimization in the next section), it has been shown that the method above yields a solution that is always of higher value for the same Q function than a full, independent parameter optimization, meaning that it is less accurate. The key issue, therefore, is whether for any Q function, including those of the Qc variety, it is possible to construct a k parameter optimization such that the line of parameter variables P(k) passes through the minimum surface of the Q function. We develop a theory to construct such a function below. However, method (i)a may still be advantageous because of the greater simplicity of the equations to be solved and the fact that only C1 f(i) functions were required, whereas here the f(i) functions must be at least C2 continuous.
Theorem 7.
For the QT(P,k) function defined in (11), where each of the f(i) functions is C2 on an open set in RNp+1 and where QT is convex, the solution P(k) of ∂QT/∂Pj=0 (j=1,2,…,Np) at any point k, whenever det∥∂2QT/∂Pi∂Pj∥≠0, determines uniquely the line equation P(k) that passes through the minimum of the function QT at k=k′, where k′ satisfies (dQT/dk)(k′)=0.
Proof.
As before f(i)=f(P,ti,k), so that
(42)QT=Q(P,k)=∑i=1N′(Yexp(ti)-f(i))2.
Define ∂f(i)/∂Pj=f(i,j), αk=Yexp(tk)-f(k) and for an independent variation of the variables (P,k) at the stationary point, we have
(43)∂QT∂Pj=hj(P,k)=c·∑i=1N′(Yexp(ti)-f(i))f(i,j)=0,j=1,2,…,Np,(44)∂QT∂k=I(P,k)=c·∑i=1N′αi∂f(i)∂k=0.
The above results for the functions hj(P,k)=0 (j=1,2,…,Np) having a unique implicit function of k, denoted P(k), by the IFT [3, Theorem 13.7, page 374] require that det∥∂hi(P,k)/∂Pj∥=det∥∂2QT/∂Pj∂Pi∥≠0 on an open set S, k∈S. More formally, the expansion of the matrix elements of the preceding determinant in (45) verifies that ∂hi(P,k)/∂Pj is symmetric, due to the commutation of the second-order partial derivatives with respect to the Pj:
(45)∂hi(P,k)∂Pj=c·∑l=1N′(αl∂2f(l)∂Pj∂Pi-∂f(l)∂Pj·∂f(l)∂Pi).
Defining Q1(k) as a function of k only by expanding QT yields the total derivative with respect to k as Q1′(k), where
(46)Q1(k)=∑iN′(αi)2,(47)Q1′(k)=c·∑j=1NpdPjdk·∑i=1N′αi∂f(i)∂Pj+c·∑i=1N′αi∂f(i)∂k.
Then hi(P,k)=0 by construction (43) so that ∂QT/∂Pj=0 (j=1,2,…,Np) and (43) implies ∑i=1N′αi(∂f(i)/∂Pj)=0 (for all j) and hence
(48)(dPjdk·∑i=1N′αi∂f(i)∂Pj)=0.
Substituting (48) derived from (43) and (44) into (47) together with the condition Q1′(k)=0 implies that c·∑i=1N′αi(∂f(i)/∂k)=0, which satisfies (44) for the free variation in k. Thus, Q1′(k)=0⇒δQT=0 for independent variation of (P,k). So QT fulfills the criteria of a stationary point at say k=k0, since ∇P,kQT=0 ([2, Proposition 16, page 112]). Suppose that QT is convex, where PT={P0,k0} is a minimum point, PT∈D, a convex subdomain of QT. Then at PT, ∇P,kQT=0, and PT is also the unique global minimum over D according to [1, Theorem 3.2, page 46]. Thus PT is unique, whether derived from a free variation of (P,k) or via P(k) dependent parameters with the Q1 function.
Note. As before, QT and Q1 may be replaced with the summation of indexes as for Qc in (30) to derive a physically more accurate fit.
4. Method (ii) Theory
Methods (i)a and (i)b, which are mutual variants of each other, are applications of the implicit method to modeling problems to provide a best fit to a function. Here, another variant of the implicit methodology, for the optimization of a target or cost function QE, is presented. One can for instance consider QE(P,k) to be an energy function with coordinates R={P,k}, where, as before, the components of P are Pj, j=1,2,…,Np, and k∈R is another coordinate so that R∈RNp+1. For bounded systems (such as those with molecular coordinates), one can write
(49)lmin,i≤Pi≤lmax,i,i=1,2,…,Np;lmin,k≤k≤lmax,k.
Thus, R∈D⊂RNp+1 is in a compact space D. Define
(50)∂QE∂Pj=oj(P,k),j=1,2,…,Np,(51)∂QE∂k=κ(P,k).
Then the equilibrium conditions become
(52)oj(P,k)=0,κ(P,k)=0.
Take (50) as the defining equations for oj(P,k), specified by QE, which casts the problem in a form compatible with the IFT, where some further qualification is required for (P,k). Assume QE is C2 on D and det∥∂oi/∂Pj∥≠0, where ∂oi/∂Pj≡∂2QE/∂Pj∂Pi. The matrix of the aforementioned determinant is symmetric, partaking of the properties due to this fact. Then by the IFT [3, Theorem 13.7, page 374], there exists a unique P(k) function where, for some k0 with oj(P(k0),k0)=0, oj∈C1 on T0 with (T0×R)⊆D and k0∈T0, such that oj(P(k),k)=0 for all k∈T0. Even for an isolated solution point k=k0, the set T0 on which the implicit function is defined remains open. Write QE,1(k)=QE(P(k),k) and QE,1′(k)=dQE,1/dk, so that
(53)dQE,1dk=∑j=1Np(∂QE∂Pj·dPjdk+∂QE∂k)=∑j=1Np(oj·dPjdk+∂QE∂k).
Denote ki as a solution to QE,1′(k)=0, where ki∈T0 in the indicated range above in (49).
Theorem 8.
The stationary points ki (i=1,2,…,Nk), where QE,1′(ki)=0, exist for the range {ki:kmin≤ki≤kmax} of coordinate k if and only if for each of these ki, (i) QE(P,k)∈C2 and (ii) det∥∂2QE(P,ki)/∂Pj∂Pi∥≠0 (for all ki, i=1,2,…,Nk). Each of these points ki∈R1 space corresponds uniquely, in a local sense in the open set T0, to some equilibrium (stationary) point of the target function QE(P,k) in RNp+1 space.
Proof.
If QE,1′(k)=0 where k∈T0, then oj=0 follows from the IFT construction of P(k), and therefore ∂QE/∂k=0 from (53), which satisfies (50) and (51) for the equilibrium point. The conditions (i) and (ii) of the theorem are a requirement of the IFT. Conversely, if oj=0 (j=1,2,…,Np) and ∂QE/∂k=0 (a stationary or equilibrium point), then by (53) QE,1′(k)=0. Hence, the coordinates {ki} for which QE,1′(ki)=0 refer to the condition δQE(P,ki)=0, and uniqueness follows from the IFT reference to the local uniqueness of the P(k) function.
Note. In a bounded system, one can choose any of the Np components Pj of P as the k coordinate (denoted ki), partly based on the convenience of solving the implicit equations to determine the ki minima and thus determine by the uniqueness criterion the coordinates of the minima in RNp+1 space (spanning the Np independent variables and k).
A coordinate choice is nondegenerate if, for the particular k coordinate chosen, there do not exist two distinct equilibrium structures A and B (a structure meaning a set of coordinate values) with kA=kB. For such structures, the total number of minima that exist within the bounded range of the k coordinate is equal to the total number of minima of the target function QE(P,k) within the bounded range. Hence, a method exists for the very challenging problem of locating and enumerating minima [9, Section 5.1, page 242 “How many stationary points are there?”]. From the uniqueness theorem of the IFT, one could infer points on the k-axis where nonuniqueness is obtained, that is, whenever det∥∂2QE/∂Pi∂Pj∥=0. In such cases, for particles with the same intermolecular potentials, permutation of the coordinates in conjunction with symmetry considerations could be of use in selecting the appropriate coordinate system to overcome these degeneracies [9, Section 4.2.5, page 205, “Appearance and disappearance of symmetry elements”]. Other methods that might address this situation include scanning different one-dimensional (1D) choices k=Pi of graphs or profiles, where if degeneracies exist for the choice k=Pj, they may not exist for the choice k=Pi for a specific point (Pi≠Pj). Thus, by scanning through all or a selected number of the 1D Pj profiles for QE,1 (a schematic sketch is given below), it would be possible to make an assignment of the location of a minimum in RNp+1 space. One is reminded of the methods that spectroscopists use in assigning different energy bands based on selection rules to uniquely characterize, for instance, vibrational frequencies. A similar analogy obtains for X-ray reflections, where the amplitude variation of the X-ray intensity in reciprocal space can be used to elucidate structure. The minima of the 1D k-coordinate scan must correspond to the minima in RNp+1 space of the QE function, given that all such minima of QE are locally strict and global within a small open set about the minima, for by continuity, QE,1(k)-QE,1(k0)>0 for |k-k0|<δ and |P(k)-P(k0)|<δ2, which rules out a maximum at k0.
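A minimal sketch of such a 1D profile scan follows. The routine solve_rest(k) is a hypothetical user-supplied function returning the implicit solution P(k) of oj(P,k)=0 for the chosen distinguished coordinate; candidate minima of QE,1 are located on a grid.

```python
import numpy as np

def profile_minima(QE, solve_rest, k_grid):
    # Reduced 1D profile Q_{E,1}(k) = QE(P(k), k); solve_rest(k) returns the
    # implicit solution P(k) of o_j(P, k) = 0 for the chosen coordinate.
    q = np.array([QE(solve_rest(k), k) for k in k_grid])
    # Interior grid points lower than both neighbours are candidate minima,
    # each corresponding (Theorem 8) to a stationary point in R^(Np+1).
    idx = np.where((q[1:-1] < q[:-2]) & (q[1:-1] < q[2:]))[0] + 1
    return k_grid[idx], q[idx]
```

Running the same scan with different choices k=Pi of the distinguished coordinate then allows degenerate assignments to be cross-checked, in the spirit of the spectroscopic analogy above.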
5. Specific Algorithms and Pseudocode for Solution to Optimization Problems Utilizing Methods (i)a, (i)b, and (ii)
We provide suggestions in pseudocode form for the above three proven methodologies. Real-world applications of these methods are very involved undertakings that are separate research topics in their own right. Nevertheless, we provide a detailed and extended application of method (i)a to a real-world chemical kinetics problem in Section 5.3.1, where experimental data from the published literature are used and the results obtained are compared to conventional techniques.
5.1. Pseudocode Algorithm for Method (i)a
Of the many variations possible, the following approach conforms to the theoretical development.
For any physical law, for a total of N′ datapoints, choose Np datapoints (set α) and solve for parameters Pj, j=1,2,…,Np according to (7) for each of the sets α, Nc in number for a known value of k. The solution set {P(k)}α may be derived analytically (as in the example below) or by appropriate linear approximations.
Determine from the above set the average P¯(k), for example, a geometric or an arithmetic mean (the latter is used here, cf. (8)) of the Nc solutions; that is, P¯(k)=(1/Nc)∑α=1NcPα(k).
Determine Q(k) (9) as
(54)Q(k)=∑i=1N′(Yexp(ti)-Yth(k,ti))2.
Solve the 1D equation Q′(k)=0 for k=k0. The solution set for the optimization problem is {P¯(k0),k0}.
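The steps above may be assembled, for example, as follows. This is a sketch only: P_bar and f are the user's implementations of (8) and (5), and the numerical derivative R(k) is assumed to change sign on the supplied bracket [k_lo, k_hi].

```python
import numpy as np
from scipy.optimize import brentq

def method_ia(data, P_bar, f, k_lo, k_hi):
    # data: (Y_exp(t_i), t_i) pairs; P_bar(k): averaged parameters, eq. (8);
    # f(P, t, k): the law (5), assumed vectorized over t.  Q(k) is eq. (9);
    # its stationary point is located by bracketing R(k) ~ Q'(k), eq. (10).
    Y = np.array([d[0] for d in data])
    t = np.array([d[1] for d in data])
    Q = lambda k: np.sum((Y - f(P_bar(k), t, k)) ** 2)
    R = lambda k, eps=1e-6: (Q(k + eps) - Q(k - eps)) / (2.0 * eps)
    k0 = brentq(R, k_lo, k_hi)      # assumes R changes sign on [k_lo, k_hi]
    return P_bar(k0), k0
```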
5.2. Pseudocode Algorithm for Method (i)b
This is an “exact” method relative to LS variation of all parameters. A suitable algorithm based on the theory could be as follows.
Solve
(55)hj(P,k)=c·∑i=1N′(Yexp(ti)-f(i))f(i,j)=0,j=1,2,…,Np,
for a particular value of k. Since there are Np equations for the hj, a solution P(k) exists. The solution may be exact or some linear approximation, depending on the nature of the problem and the convergence criteria.
Form the function Q1(k)=∑i=1N′(αi)2, (46), where αi=Yexp(ti)-f(P,ti,k).
Solve Q1′(k)=0 for some k0; that is, Q1′(k0)=0.
The solution set to the problem is {P(k0),k0} for optimizing the LS function QT=Q(P,k)=∑i=1N′(Yexp(ti)-f(P,ti,k))2.
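A possible realization of these steps follows, as a sketch under the assumptions that fsolve converges from the supplied initial guess and that the derivatives are adequately approximated by finite differences; the constant c of (55) is irrelevant for the root and is omitted.

```python
import numpy as np
from scipy.optimize import fsolve, brentq

def jacobian_fd(g, P, h=1e-7):
    # Forward-difference Jacobian of the vector function g with respect to P.
    g0 = g(P)
    return np.column_stack([(g(P + h * e) - g0) / h for e in np.eye(P.size)])

def method_ib(Y, t, f, P_guess, k_lo, k_hi):
    # Step 1: for fixed k, solve h_j(P, k) = 0 of eq. (55), that is, the
    # stationarity of Q_T in the P directions only.
    def P_of_k(k):
        def h(P):
            P = np.asarray(P)
            r = Y - f(P, t, k)                         # residuals alpha_i
            J = jacobian_fd(lambda q: f(q, t, k), P)   # df(i)/dP_j
            return J.T @ r                             # the N_p values h_j
        return fsolve(h, P_guess)
    # Steps 2-3: form Q1(k) of eq. (46) and solve Q1'(k) = 0 on a bracket
    # where the numerical derivative changes sign (assumed supplied).
    def Q1(k):
        r = Y - f(P_of_k(k), t, k)
        return r @ r
    dQ1 = lambda k, eps=1e-6: (Q1(k + eps) - Q1(k - eps)) / (2.0 * eps)
    k0 = brentq(dQ1, k_lo, k_hi)
    return P_of_k(k0), k0                              # step 4: {P(k0), k0}
```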
5.3. Pseudocode Algorithm for Method (ii)
Here QE(P,k) is a general function, not necessarily of form QT in (42). Then for Np variables {P}, define the functions oj(P,k)=∂QE/∂Pj, j=1,2,…,Np, as in (50), and κ(P,k)=∂QE/∂k as in (51).
For a particular k, solve oj(P,k)=0 (52) for P. P(k) exists since there are Np equations. Approximate linearized solutions might also be attempted in the vicinity of the stationary point of k.
Form the function QE,k(k)=QE(P(k),k).
Solve for k0 such that QE,k′(k0)=0.
The solution to the optimization problem of QE(P,k), with all the domain variables varied independently, is {P(k0),k0}.
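One hedged realization in Python could be the following; it targets a minimum of QE,1 rather than a general stationary point, and grad_P is the user's implementation of the oj of (50).

```python
import numpy as np
from scipy.optimize import fsolve, minimize_scalar

def method_ii(QE, grad_P, P_guess, k_lo, k_hi):
    # grad_P(P, k): vector of o_j(P, k) = dQE/dP_j from eq. (50).
    # For fixed k, the implicit function P(k) of the IFT is approximated by
    # fsolve; the reduced profile Q_{E,1}(k) = QE(P(k), k) is then optimized
    # over the single coordinate k.
    P_of_k = lambda k: fsolve(lambda P: grad_P(P, k), P_guess)
    QE1 = lambda k: QE(P_of_k(k), k)
    res = minimize_scalar(QE1, bounds=(k_lo, k_hi), method="bounded")
    return P_of_k(res.x), res.x
```

Pairing this reduced profile with the grid scan sketched at the end of Section 4 gives the two-pronged domain-and-range approach discussed in the introduction.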
5.3.1. Application of Method (i)a Algorithm (Section 5.1) in Chemical Kinetics
The utility of one of the above triad of methods is illustrated in the determination of two parameters in chemical reaction rate studies, of 1st and 2nd order, respectively, using data from the published literature, where method (i)a yields values that agree within experimental error with those quoted in the literature. The method can directly derive certain parameters like the final concentration terms (e.g., λ∞ and Y∞) if k, the rate constant, is the single optimizing variable in this approximation, which is not the case in most conventional methodologies. We assume here that the rate laws and rate constants are not slowly varying functions of the reactant or product concentrations, an assumption which has recently, from simulation, been shown to be generally not the case [29]. Under this standard assumption, the rate equations below are all obtained. The first-order reaction studied here is (i) the methanolysis of ionized phenyl salicylate with data derived from the literature [30, Table 7.1, page 381], and the second-order reaction analyzed is (ii) the reaction between plutonium(VI) and iron(II) according to the data in [31, Table II, page 1427] and [32, Tables 2–4, page 25].
5.3.2. First-Order Results
Reaction (i) above corresponds to
(56)PS-+CH3OH⟶kaMS-+PhOH
where the rate law is pseudo first-order expressed as
(57)rate=ka[PS-]=kc[CH3OH][PS-]
with the concentration of methanol held constant (80% v/v) and where the physical and thermodynamical conditions of the reaction appear in [30, Table 7.1, page 381]. The change in time t for any material property λ(t), which in this case is the absorbance A(t) (i.e., A(t)≡λ(t)) is given by
(58)λ(t)=λ∞-(λ∞-λ0)exp(-kat)
for a first-order reaction, where λ0 refers to the measurable property value at time t=0 and λ∞ is the value at t=∞, which is usually treated as a parameter to yield the best least squares fit, even if its optimized value for monotonically increasing functions (positive dλ/dt at all t) is less than an experimentally determined λ(t) at some time t. In Table 7.1 of [30], for instance, A(t=2160 s)=0.897>Aopt,∞=0.882, and this value of A∞ is used to derive the best estimate of the rate constant as (16.5±0.1)×10-3 sec-1 in that work.
For this reaction, the Pi of (5) refers to λ∞ so that P≡λ∞ with Np=1 and k≡ka. To determine the parameter λ∞ as a function of ka according to (9) based on the entire experimental {(λexp,ti)} dataset we invert (58) and write
(59)λ∞(k)=(1/N′)∑i=1N′[λexp(ti)-λ0exp(-kti)]/[1-exp(-kti)],
where the summation is over all terms with the i subscript of the experimental dataset that do not lead to zeros or singularities, such as when ti=0. We define the nonoptimized, continuously deformable theoretical curve λth, where λth≡Yth(t,k) in (6), as
(60)λth(t,k)=λ∞(k)-(λ∞(k)-λ0)exp(-kat).
With such a relationship of the λ∞ parameter P to k, we seek the least squares minimum of Q1(k), where Q1(k)≡Q of (9), for this first-order rate constant k in the form
(61)Q1(k)=∑i=1N(λexp(ti)-λth(ti,k))2,
where the summation is over all the experimental (λexp(ti),ti) values. The solution for the rate constant k corresponds to the zero of the R(k) function, which exists for both orders. The P parameters (λ∞ and Y∞) are derived by back substitution into (59) and (66), respectively. The Newton-Raphson (NR) numerical procedure [4, page 456] was used to find the roots of R(k). For each dataset, there exists a value for λ∞, and so the error expressed as a standard deviation may be computed. The error tolerance for the NR procedure was set to 1.0×10-10. We define the function deviation fd as the standard deviation of the experimental results from the best-fit curve, where fd={(1/N)∑i=1N(λexp(ti)-λth(ti))2}1/2. Our results are as follows: ka=(1.62±0.09)×10-2 s-1; λ∞=0.88665±0.006; and fd=3.697×10-3.
The experimental estimates are ka=(1.65±0.01)×10-2 s-1; λ∞=0.882±0.0; and fd=8.563×10-3.
The experimental method involves adjusting the A∞≡λ∞ to minimize the fd function, and hence no estimate of the error in A∞ could be made. Method (i)a allows direct calculation of λ∞ and its error without the extraneous fittings required in the conventional methods. It is clear that our method has a lower fd value and is thus a better fit, and the parameter values can be considered to coincide with the experimental estimates within experimental error. Figure 2 shows the close fit between the curve due to our optimization procedure and experiment. The resulting R(k) function (10) for the first-order reaction based on the published dataset is given in Figure 3. The very slight variation between the two curves shown in Figure 2 could be due to experimental uncertainties.
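To illustrate the computational path of Section 5.1 for this reaction, the following sketch reproduces the pipeline (59)→(60)→(61)→(10) on synthetic data generated from (58); the numerical values below are illustrative only, and the reader would substitute the (λexp(ti),ti) pairs of [30, Table 7.1] for the real analysis.

```python
import numpy as np
from scipy.optimize import newton

# Synthetic first-order data from eq. (58) with a known rate constant
# (illustrative values only; replace with the data of [30, Table 7.1]).
ka_true, lam0, lam_inf_true = 1.65e-2, 0.1, 0.9
t = np.linspace(60.0, 2160.0, 15)
lam = lam_inf_true - (lam_inf_true - lam0) * np.exp(-ka_true * t)

def lam_inf_of_k(k):
    # Eq. (59): average of the exact single-datapoint inversions of eq. (58).
    e = np.exp(-k * t)
    return np.mean((lam - lam0 * e) / (1.0 - e))

def Q1(k):
    # Eq. (61) with the continuously deformable curve lambda_th of eq. (60).
    li = lam_inf_of_k(k)
    lam_th = li - (li - lam0) * np.exp(-k * t)
    return np.sum((lam - lam_th) ** 2)

def R(k, h=1e-7):
    return (Q1(k + h) - Q1(k - h)) / (2.0 * h)      # R(k) ~ Q'(k), eq. (10)

k0 = newton(R, x0=1.0e-2, tol=1e-10)                # NR root of R(k) = 0
print(k0, lam_inf_of_k(k0))  # recovers ka_true and lam_inf_true on clean data
```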
Figure 2: Plot of the experimental curve and the curve with optimized parameters showing the very close fit between the two. The slight difference between the two can probably be attributed to experimental errors.
Figure 3: R(k) functions (10) for reactions (i) and (ii), of order one and two in reaction rate, respectively. The horizontal bars correspond to the zero values of the R(k) functions.
5.3.3. Second-Order Results
To further test our method, we also analyze the second-order reaction
(62)Pu(VI)+2Fe(II)⟶kbPu(IV)+2Fe(III)
whose rate v is given by v=k0[PuO22+][Fe2+], where k0 is relative to the constancy of other ions in solution such as H+. The equations are very different in form from the first-order expressions and serve to confirm the viability of the current method.
According to Espenson, the above stoichiometry is kinetically equivalent to the reaction scheme [32, equation (2-36)]
(63)PuO22++Feaq2+⟶kbPuO2++Feaq3+
which also follows from the work of Newton and Baker [31, equations (8),(9), page 1429] whose data [31, Table II, page 1427] we use and analyze to verify the principles presented here. Espenson had also used the same data as we have to derive the rate constant and other parameters [32, pages 25-26] which are used to check the accuracy of our methodology. The overall absorbance in this case Y(t) is given by [32, equation (2-35)]
(64)Y(t)=[Y∞+{Y0(1-α)-Y∞}exp(-kΔ0t)]/[1-αexp(-kΔ0t)],
where α=[A]0/[B]0 is the ratio of initial concentrations with [B]0>[A]0 and Δ0=[B]0-[A]0, where [B]=[Pu(VI)], [A]=[Fe(II)], [B]0=4.47×10-5 M, and [A]0=3.82×10-5 M. A rearrangement of (64) leads to the equivalent expression [32, equation (2-34)]
(65)ln{1+Δ0(Y0-Y∞)/([A]0(Yt-Y∞))}=ln([B]0/[A]0)+kΔ0t.
According to Espenson, one cannot use this equivalent form [32, page 25] “because an experimental value of Y∞ was not reported” and he further asserts that if Y∞ is determined autonomously, then k, the rate constant, may be determined. Thus, central to all conventional methods is the autonomous and independent status of both k and Y∞. We overcome this interpretation by defining Y∞ as a function of the total experimental spectrum of ti values and k by inverting (64) to define Y∞(k) as
(66)Y∞(k)=(1/N′)∑i=1N′[Yexp(ti){exp(kΔ0ti)-α}+Y0(α-1)]/[exp(kΔ0ti)-1],
where the summation is over all experimental values that do not lead to singularities, such as at ti=0. In this case, the P parameter is given by Y∞(k)=P1(k), and kb=k is the varying k parameter of (5). We likewise define a function Yth of k that is also a function of t, but where the k parameter is interpreted as a “distortion” parameter, in the following manner:
(67)Yth(t,k)=[Y∞(k)+{Y0(1-α)-Y∞(k)}exp(-kΔ0t)]/[1-αexp(-kΔ0t)].
In order to extract the parameters k and Y∞, we minimize the square function Q2(k) for this second-order rate constant with respect to k given as
(68)Q2(k)=∑i=1N(Yexp(ti)-Yth(ti,k))2,
where the summation is over the N experimental ti coordinates. The solution to the minimization problem is attained when the corresponding R(k) function (10) is zero. The NR method was used to solve R(k)=0 with an error tolerance of 1.0×10-10. With the same notation as in the first-order case, the second-order results are kb=938.0±18 (Ms)-1; Y∞=0.0245±0.003; and fd=9.606×10-4.
The experimental estimates from the conventional methods are [32, page 25]: kb=949.0±22(Ms)-1; Y∞=0.025±0.003.
Again the two results are in close agreement. The graph of the experimental curve and the one that is derived from our optimization method are given in Figure 4.
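An analogous sketch for the second-order pipeline (64)→(66)→(67)→(68) follows, again on synthetic data with illustrative values (Y0 in particular is hypothetical); the real analysis would use the Y(ti) data of [31, Table II].

```python
import numpy as np
from scipy.optimize import newton

# Synthetic second-order data from eq. (64); the concentrations match the
# text, but kb_true, Y0, and Yinf_true are illustrative placeholders.
A0, B0 = 3.82e-5, 4.47e-5
alpha, D0 = A0 / B0, B0 - A0        # D0 denotes Delta_0 = [B]0 - [A]0
kb_true, Y0, Yinf_true = 940.0, 0.59, 0.025
t = np.linspace(2.0, 60.0, 12)
e = np.exp(-kb_true * D0 * t)
Y = (Yinf_true + (Y0 * (1.0 - alpha) - Yinf_true) * e) / (1.0 - alpha * e)

def Y_inf_of_k(k):
    # Eq. (66): average of the single-datapoint inversions of eq. (64).
    E = np.exp(k * D0 * t)
    return np.mean((Y * (E - alpha) + Y0 * (alpha - 1.0)) / (E - 1.0))

def Q2(k):
    # Eq. (68) with the distorted theoretical curve Y_th of eq. (67).
    Yi = Y_inf_of_k(k)
    e = np.exp(-k * D0 * t)
    Yth = (Yi + (Y0 * (1.0 - alpha) - Yi) * e) / (1.0 - alpha * e)
    return np.sum((Y - Yth) ** 2)

R = lambda k, h=1e-3: (Q2(k + h) - Q2(k - h)) / (2.0 * h)   # eq. (10)
k0 = newton(R, x0=900.0)
print(k0, Y_inf_of_k(k0))   # recovers kb_true and Yinf_true on clean data
```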
Figure 4: Graph of the experimental and calculated curves based on the current induced parameter-dependent optimization method.
6. Conclusions
The triad of associated implicit function optimization covers both the topics of modeling of data and the optimization of arbitrary functions where experimental or theoretical considerations require that a single variable is tagged to a process variable that is iteratively relaxing to an equilibrium stationary point. Applying method (i)a to chemical kinetics allows for the direct determination of parameters that is not possible by application of the standard methodologies. The results presented here show that for linked variables, it is possible to derive all the parameters associated with a curve by considering only one independent variable which serves as the independent variable for other functions in the optimization process as illustrated by methods (i)a,b. Apart from possible reduced errors in the computations, it might also be a more accurate way of deriving parameters that are more influenced or conditioned (on physical grounds) by the value of one parameter (such as k here) than others; the current methods that give equal weight to all the variables might in some cases lead to results that would be considered “unphysical.” In complex dynamical systems with multiprocesses, the physical considerations are such that for scientific purposes, it would be advantageous if optimization would be conducted on just one primary coordinate variable, such as in attempting to derive the most general stable conformer in a large molecule, where there are thousands of local minima present if all free coordinate variables are considered [9, Section 6.7, page 330]. For such systems, method (ii) might be applicable. This generalized potential surface might be found suitable for reaction trajectory calculations [9, Chapter 4, page 192 on “Features of a landscape”] that require a single path variable, where the general optimized conformer would be relevant to the study of the potential surfaces and force fields present.
List of Variables
(P,t,k):
the entire domain space of a function such as f(P,t,k) where k is the variable associated with a sequence of measurements, such as along the time coordinate. The vector P with components Pi is the normal parameter that must be optimized, and k is the specially chosen variable on experimental grounds that is optimized whilst constructing functions such that P=P(k) (6)
f(P,t,k):
refers to a function that is proposed to be a “law of nature” whose parameters {P,k} are to be optimized (7)
Ylaw(t,k):
Ylaw(t,k)=f(P,t,k) (5)
Yth(t,k):
the theoretical law of nature if P=P¯(k) is determined. That is, Yth(t,k)=f(P¯,t,k) (6)
Yexp(ti):
an experimentally determined datapoint that ideally represents the range of f(P,t,k) if there was no error; that is, Yexp(ti)=f(P′,t,k′) for a perfect fit for all ti, for fixed (P′,k′) (7)
P¯i(k):
an averaged value for Pi(k) based on some specified algorithm (8)
QX,Q(k):
least squares (LS) function to optimize Ylaw(t,k) in the case of Q(k) (9). In general, all QX functions are LS functions specified by X (e.g., (9), (11), (42), etc.)
QE:
General cost or object function to be optimized, not necessarily in LS form (50)
R(k):
=Q′(k) (10)
λi:
Lagrange multipliers associated with the QX optimization (14)
L:
Lagrangian to the optimization problem (17)
kq:
chemical kinetics rate constant for reaction q (e.g., (56))
λ(t),λ∞:
absorbance measurements for first-order chemical kinetics reactions at time t and infinity (e.g., (58) and (59))
Y(t),Y∞:
absorbance measurements for second-order chemical kinetics reactions at time t and infinity (e.g., (64) and (66)).
Conflict of Interests
The author declares that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported by University of Malaya Grant UMRG(RG077/09AFR) and Malaysian Government grant FRGS(FP084/2010A). This work was initiated and completed during a Sabbatical research visit (2012-2013) to the Atomistic Simulation Centre (ASC), School of Mathematics and Physics, Queen’s University Belfast. I thank Ruth Lynden-Bell (Chemistry Department, Cambridge University) for facilitating this visit. Cordial discussions concerning real world applications with faculty at ASC are gratefully acknowledged. I thank my hosts, Jorge Kohanoff (ASC) and Christopher Hardacre (Chem. Dept., QUB) for congenial hospitality during this time.
References
[1] B. D. Craven, Functions of Several Variables, Chapman & Hall, London, UK, 1981.
[2] J. DePree and C. Swartz, Introduction to Real Analysis, John Wiley & Sons, New York, NY, USA, 1988.
[3] T. M. Apostol, Mathematical Analysis, 2nd edition, Narosa Publishing House, New Delhi, India, 2002.
[4] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes: The Art of Scientific Computing, 3rd edition, Cambridge University Press, Cambridge, UK, 2007.
[5] J. A. Snyman, Practical Mathematical Optimization, Springer, New York, NY, USA, 2005.
[6] W. C. Davidon, “Variable metric method for minimization,” SIAM Journal on Optimization, vol. 1, no. 1, pp. 1–17, 1991.
[7] C. G. Broyden, “The convergence of a class of double-rank minimization algorithms,” Journal of the Institute of Mathematics and Its Applications, vol. 6, pp. 76–90, 1970.
[8] A. Banerjee, N. Adams, J. Simons, and R. Shepard, “Search for stationary points on surfaces,” Journal of Physical Chemistry, vol. 89, no. 1, pp. 52–57, 1985.
[9] D. J. Wales, Energy Landscapes, Cambridge Molecular Science (edited by R. Saykally, A. Zewail, and D. King), Cambridge University Press, Cambridge, UK, 2003.
[10] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” Science, vol. 220, no. 4598, pp. 671–680, 1983.
[11] D. M. Bates and D. G. Watts, Nonlinear Regression Analysis and Its Applications, Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons, New York, NY, USA, 1988.
[12] G. H. Golub and V. Pereyra, “The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate,” SIAM Journal on Numerical Analysis, vol. 10, pp. 413–432, 1973.
[13] I. H. M. van Stokkum, D. S. Larsen, and R. van Grondelle, “Global and target analysis of time-resolved spectra,” Biochimica et Biophysica Acta, vol. 1657, no. 2-3, pp. 82–104, 2004.
[14] P. Shearer and A. C. Gilbert, “A generalization of variable elimination for separable inverse problems beyond least squares,” Inverse Problems, vol. 29, no. 4, Article ID 045003, 2013.
[15] N. I. M. Gould, Y. Loh, and D. P. Robinson, “A filter method with unified step computation for nonlinear optimization,” SIAM Journal on Optimization, vol. 24, no. 1, pp. 175–209, 2014.
[16] A. Milzarek and M. Ulbrich, “A semismooth Newton method with multidimensional filter globalization for l1-optimization,” SIAM Journal on Optimization, vol. 24, no. 1, pp. 298–333, 2014.
[17] H. M. Gomes, “Truss optimization with dynamic constraints using a particle swarm algorithm,” Expert Systems with Applications, vol. 38, no. 1, pp. 957–968, 2011.
[18] M. Gendreau and J.-Y. Potvin, Eds., Handbook of Metaheuristics, vol. 146 of International Series in Operations Research & Management Science, 2nd edition, Springer, New York, NY, USA, 2010.
[19] R. Varadhan and P. D. Gilbert, “BB: an R package for solving a large system of nonlinear equations and for optimizing a high-dimensional nonlinear objective function,” Journal of Statistical Software, vol. 32, no. 4, pp. 1–26, 2009.
[20] F. M. P. Raupp, L. M. G. Drummond, and B. F. Svaiter, “A quadratically convergent Newton method for vector optimization,” Optimization, vol. 63, no. 5, pp. 661–677, 2014.
[21] M. D. Al-Khaleel, M. J. Gander, and A. E. Ruehli, “Optimization of transmission conditions in waveform relaxation techniques for RC circuits,” SIAM Journal on Numerical Analysis, vol. 52, no. 2, pp. 1076–1101, 2014.
[22] A. Amirteimoori, D. K. Despotis, and S. Kordrostami, “Variables reduction in data envelopment analysis,” Optimization, vol. 63, no. 5, pp. 735–745, 2014.
[23] S. Solar, W. Solar, and N. Getoff, “A pulse radiolysis-computer simulation method for resolving of complex kinetics and spectra,” Radiation Physics and Chemistry, vol. 21, no. 1-2, pp. 129–138, 1983.
[24] A.-M. Wazwaz, “A note on using Adomian decomposition method for solving boundary value problems,” Foundations of Physics Letters, vol. 13, no. 5, pp. 493–498, 2000.
[25] J. J. Houser, “Estimation of A∞ in reaction-rate studies,” Journal of Chemical Education, vol. 59, no. 9, pp. 776–777, 1982.
[26] P. Moore, “Analysis of kinetic data for a first-order reaction with unknown initial and final readings by the method of non-linear least squares,” Journal of the Chemical Society, Faraday Transactions 1, vol. 68, pp. 1890–1893, 1972.
[27] W. E. Wentworth, “Rigorous least squares adjustment: application to some non-linear equations, I,” Journal of Chemical Education, vol. 42, no. 2, pp. 96–103, 1965.
[28] W. E. Wentworth, “Rigorous least squares adjustment: application to some non-linear equations, II,” Journal of Chemical Education, vol. 42, no. 3, pp. 162–167, 1965.
[29] C. G. Jesudason, “The form of the rate constant for elementary reactions at equilibrium from MD: framework and proposals for thermokinetics,” Journal of Mathematical Chemistry, vol. 43, no. 3, pp. 976–1023, 2008.
[30] M. N. Khan, Micellar Catalysis, vol. 133 of Surfactant Science Series (edited by A. T. Hubbard), Taylor & Francis, Boca Raton, Fla, USA, 2007.
[31] T. W. Newton and F. B. Baker, “The kinetics of the reaction between plutonium(VI) and iron(II),” Journal of Physical Chemistry, vol. 67, no. 7, pp. 1425–1432, 1963.
[32] J. H. Espenson, Chemical Kinetics and Reaction Mechanisms, 2nd edition, McGraw-Hill, Singapore, 1995.