Multiple linear regression analysis is widely used to link an outcome with predictors in order to better understand the behaviour of the outcome of interest. Usually, under the assumption that the errors follow a normal distribution, the coefficients of the model are estimated by minimizing the sum of squared deviations. A new approach based on maximum likelihood estimation is proposed for finding the coefficients of linear models with two predictors without any restrictive assumption on the distribution of the errors. The algorithm was developed, implemented, and tested as proof of concept on ten sets of compounds by investigating the link between activity/property (as outcome) and structural information encoded by molecular descriptors (as predictors). The results on real data demonstrated that in all investigated cases the power of the error differs from the conventional value of two when the generalized Gauss-Laplace distribution is used to relax the restrictive assumption of normally distributed errors. The Gauss-Laplace distribution of the error could therefore not be rejected, while the hypothesis that the power of the error under the Gauss-Laplace distribution is itself normally distributed also failed to be rejected.
1. Introduction
The first report on multiple linear regression appeared in 1885 [1] and was detailed in 1886 [2]. The classical treatment of multiple regression was built on the product-moment method introduced in 1846 [3] and later connected with the optimal correlation [4].
In his first published paper, Fisher introduced the method of likelihood maximization [5], later used in conjunction with Pearson's correlation [6], a paper which started a contentious debate between the method of central moments and the method of likelihood estimation [7], replied to in [8], and finally linked with the partial correlation coefficients [9].
A multiple linear regression model involves more than two variables, one (y) assumed dependent and the others (x1, x2, …, xm) assumed independent; it is considered here as a continuation of a previous study [10]. The most important assumption is that the data are paired, that is, a natural association exists between the values of the variables. Such an association is obtained, for instance, when a multiple linear regression is constructed for a measured property/activity (y) of a series of compounds for which other measured properties/activities or structure-based descriptors are available (x1, x2, …, xm); the natural association is in this case the (chemical) compound responsible for that property/activity/descriptor value.
The least squares method is the standard approach for regression analysis; the method is credited to Legendre [11] (for a debate about its inventor, see [12]) and implicitly assumes that the error is normally distributed.
By iteratively applying a local quadratic approximation to the likelihood (through the Fisher information [13]), the least squares method was used in [14] to fit generalized linear models, unifying classical, logistic, and Poisson (linear) regression: iteratively reweighted least squares converges to the maximum likelihood estimates of the model parameters.
The generalized Gauss-Laplace distribution is the natural extension [15] of Gauss's [16] and Laplace's [17] symmetric distributions. It is a triparametric distribution (location, scale, and shape); parameter estimation via maximum likelihood and via the method of moments has been reported in [18], with the conclusion that the estimates have no closed form and must be obtained numerically.
A more general result regarding maximum likelihood estimation can be found in [19], but it unfortunately provides only sufficient conditions under which the maximum likelihood estimate exists and is unique, without the converse (i.e., there exist conditions other than the ones given under which the maximum likelihood estimate exists and is unique). Moreover, for numerical estimates, uniqueness can hardly be discussed.
Estimating the parameters of a multiple linear regression under the assumption of a generalized Gauss-Laplace distribution of the error is a hard problem which can be solved only numerically; it involves an optimization problem with m + 3 constraints, where m is the number of unknown (to be determined) coefficients of the multiple linear regression. In this paper a mathematical and a numerical treatment of the problem is proposed.
In order to provide a proof of concept for the proposed method of relaxing the distribution of the error when linear regression is used to link chemical information with biological measurements, ten previously reported datasets were considered, all with a significant role in human medicine or ecology.
2. Mathematical Treatment
One may define the generalized Gauss-Laplace (GL) distribution as

(1) GL(x; μ, σ, q) = (q/(2σ)) · (Γ(3/q)^(1/2)/Γ(1/q)^(3/2)) · exp(−(|x − μ| / (σ·(Γ(1/q)/Γ(3/q))^(1/2)))^q),

where Γ(·) is the Gamma function and μ (location), σ (scale), and q (shape) are the parameters of the distribution.
This definition of the Gauss-Laplace distribution will be used here to relax the normality constraint on the distribution of the error (ε).
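For reference, the density (1) can be evaluated numerically term by term; the following sketch (our naming, Python standard library only) does so, using σ as the standard deviation of the distribution:

```python
from math import exp, gamma

def gl_pdf(x, mu, sigma, q):
    """Generalized Gauss-Laplace density of Eq. (1): location mu,
    scale sigma (the standard deviation), and shape q (the power)."""
    # normalization constant: q/(2*sigma) * Gamma(3/q)^(1/2) / Gamma(1/q)^(3/2)
    c = q / (2.0 * sigma) * gamma(3.0 / q) ** 0.5 / gamma(1.0 / q) ** 1.5
    # the scale entering the exponent so that sigma is the standard deviation
    alpha = sigma * (gamma(1.0 / q) / gamma(3.0 / q)) ** 0.5
    return c * exp(-(abs(x - mu) / alpha) ** q)
```

For q = 2 this reduces to the Gauss (normal) density and for q = 1 to the Laplace density, both parameterized by their standard deviation.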
2.1. Statement of the Problem
Multiple linear regression under the assumption of the GL distribution (see (1)) for the error ε, with a_j (1 ≤ j ≤ m), σ, and q to be determined from sampled data, is stated in the following equation:

(2) MLRGL(ε; σ, q) = GL(ε; (a_j)_{1≤j≤m}, σ, q), where ε = y − Σ_{1≤j≤m} a_j·x_j (and ε̂ ≈ 0, ε̄ = 0).
The case with intercept (y ≈ ŷ = a_0 + Σ_{1≤j≤m} a_j·x_j) is reduced to the case without intercept by increasing the number of independent variables by one (a_{m+1} ← a_0; x_{m+1} ← 1; and m ← m + 1, so that y ≈ ŷ = Σ_{1≤j≤m+1} a_j·x_j) and therefore will not be mentioned further. The substitution given in (2) transforms the distribution from a univariate into a multivariate one and can be mathematically characterized by a series of properties, such as those given in [29] (results applicable after resizing x from 0 and x_0 ← y).
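The intercept reduction amounts to appending a constant column to the design matrix, so that its coefficient plays the role of a_0 (a toy illustration with assumed values, numpy assumed):

```python
import numpy as np

# two predictors (columns) for three observations (toy values)
X = np.array([[1.2, 3.4],
              [0.5, 2.2],
              [2.0, 0.7]])

# append x_{m+1} = 1 so that its coefficient a_{m+1} plays the role of a_0
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
```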
Let us take a sample of n paired measurements (e.g., (y_i, (x_{j,i})_{1≤j≤m})_{1≤i≤n}, where n is the number of paired measurements and m is the number of independent measures). The (log-)likelihood for the sample is

(3) L_MLRGL(·) = Σ_{i=1}^{n} ln MLRGL(·).
Performing the substitution given in (2) and writing its parameters in full, the expression of the likelihood function from (3) becomes

(4) L_MLRGL((a_j)_{1≤j≤m}, σ, q) = n·ln[(q/(2σ)) · Γ(3/q)^(1/2)/Γ(1/q)^(3/2)] − σ^(−q)·(Γ(3/q)/Γ(1/q))^(q/2) · Σ_{i=1}^{n} |y_i − Σ_{1≤j≤m} a_j·x_{j,i}|^q.
The likelihood is at a maximum when all its partial derivatives are zero:

(5) 0 = ∂L_MLRGL/∂a_1 = ⋯ = ∂L_MLRGL/∂a_m = ∂L_MLRGL/∂σ = ∂L_MLRGL/∂q.
2.2. Simplification of the Problem
The problem of finding the maximum of the likelihood is a typical extremum problem, but not an easy one, because it depends on a large number of variables. The easiest simplification is to eliminate one variable, namely σ. Setting the derivative of L_MLRGL with respect to σ to zero provides the value of σ:

(6) n = q·S / (σ^q · (Γ(1/q)/Γ(3/q))^(q/2)), with S = S(q, (a_j)_{1≤j≤m}) = Σ_{i=1}^{n} |y_i − Σ_{1≤j≤m} a_j·x_{j,i}|^q.
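Solving (6) for σ gives σ = (q·S/n)^(1/q)·(Γ(3/q)/Γ(1/q))^(1/2). A quick numerical check (our code, standard library only) confirms that for q = 2 this reduces to the familiar maximum likelihood estimate √(S/n):

```python
from math import gamma, sqrt

def sigma_from(S, n, q):
    """Eq. (6) solved for sigma, given S of Eq. (6) and the sample size n."""
    return (q * S / n) ** (1.0 / q) * sqrt(gamma(3.0 / q) / gamma(1.0 / q))
```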
Please note that S = S(q, (a_j)_{1≤j≤m}) does not depend on σ. Therefore, let L_MLRGLS(S, n, q) be the likelihood with this constraint imposed. After some calculations, the expression of L_MLRGLS(S, n, q) is

(7) L_MLRGLS(S, n, q) = −n/q + n·ln[n^(1/q) · q^(1−1/q) · (2·S^(1/q)·Γ(1/q))^(−1)],

(8) S = S(q, (a_j)_{1≤j≤m}) = Σ_{i=1}^{n} |y_i − Σ_{j=1}^{m} a_j·x_{j,i}|^q = Σ_{i=1}^{n} T(i, (a_j)_{1≤j≤m})^(q/2), where T(i, (a_j)_{1≤j≤m}) = (y_i − Σ_{1≤j≤m} a_j·x_{j,i})².
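The profiled likelihood (7) depends on the coefficients only through S of (8). A sketch (our naming, standard library only); for q = 2 it must agree with the classical Gaussian maximum log-likelihood −(n/2)·ln(2πS/n) − n/2:

```python
from math import gamma, log, pi

def S_of(y, X, a, q):
    """Eq. (8): S(q, a) = sum_i |y_i - sum_j a_j x_{j,i}|^q (plain lists)."""
    return sum(abs(yi - sum(aj * xj for aj, xj in zip(a, xi))) ** q
               for yi, xi in zip(y, X))

def L_profile(S, n, q):
    """Eq. (7): the log-likelihood with sigma eliminated via Eq. (6)."""
    return -n / q + n * log(n ** (1 / q) * q ** (1 - 1 / q)
                            / (2 * S ** (1 / q) * gamma(1 / q)))
```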
On the other hand, only S depends on (a_j)_{1≤j≤m}, and therefore when the derivative of S relative to a_1, …, a_m is zero, the derivative of the maximum likelihood function (either of L_MLRGL and L_MLRGLS) is zero as well. Taking the partial derivatives of S, with the above substitution (function T), the following equation results:

(9) Σ_{i=1}^{n} y_i·x_{u,i}·W(i, q) = Σ_{j=1}^{m} a_j · Σ_{i=1}^{n} x_{u,i}·x_{j,i}·W(i, q), for 1 ≤ u ≤ m, where W(i, q) = |y_i − Σ_{1≤j≤m} a_j·x_{j,i}|^(q−2).
At this point only the expression of the likelihood function (see (7) and (8)) must be included in the new statement of the problem (see (9)) in order to preserve in full the derivative constraints of the initial problem (see (4) and (5)). There is no obvious further reduction of the problem. However, reviewing the results obtained up to this point, the cancellation of the likelihood derivative relative to σ was built into the simplification from the beginning (see (6)), while the cancellation of the likelihood derivatives relative to the regression coefficients a_1, …, a_m is equivalent to the equation given above for the regression coefficients (see (9)). On the other hand, (7)–(9) facilitate an iterative solution of the problem.
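System (9) is linear in the coefficients once W(i, q) is frozen at their previous values, so one fixed-point step reduces to a weighted least squares solve. A sketch (our naming, numpy assumed; residuals of exactly zero would need guarding when q < 2):

```python
import numpy as np

def coef_step(y, X, a, q):
    """One fixed-point update of Eq. (9): solve the normal equations
    weighted by W(i, q) = |y_i - sum_j a_j x_{j,i}|^(q-2),
    with W evaluated at the current coefficients a."""
    W = np.abs(y - X @ a) ** (q - 2.0)
    A = X.T @ (W[:, None] * X)   # sum_i x_{u,i} x_{j,i} W(i, q)
    b = X.T @ (W * y)            # sum_i y_i x_{u,i} W(i, q)
    return np.linalg.solve(A, b)
```

For q = 2 all weights equal 1 and a single step returns the ordinary least squares solution.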
3. Fixed-Point Theory for Iterating to the Optimal Solution
A convenient notation was used in (9) to suggest the further treatment of the problem. Fisher and Mackenzie were actually the first to propose such a numerical treatment in statistics [30]. It is based on the assumption that, near the optimal solution, an iterative evaluation of the coefficients (here q and (a_j)_{1≤j≤m}) from their previous values (hidden in (9) inside the function W(i, q)) leads to the optimum. The optimum is reached when the values show no significant change from one step to the next; at that point, the W(i, q) function acts as the argument of a contraction mapping [31].
There are some inconveniences to a smooth application of fixed-point theory. One of them is that obtaining the maximum of the L_MLRGLS function (see (7), obtained for known values of S and n, where ∂L_MLRGLS(S, n, q)/∂q vanishes) is not a simple problem; from its explicit expression, more than one local maximum is to be expected. Fortunately, some clues exist, such as the domain of q (starting from 0) and the expected range of the power of the error (q is expected to lie between 0.1 and 10; values between 0.01 and 100 are possible but very unlikely, and outside this range the precision of the computations often fails anyway). The biggest inconvenience, however, is that (9) is not a single equation but a system of equations, and here we may only provide different iteration strategies, hoping that at least one of them provides the contraction mapping. Namely, we may:
- start from some initial values of the regression coefficients ((a_{j,0})_{1≤j≤m}) and of the power of the error (q_0);
- use the initial values to obtain the likelihood function L_MLRGLS (from (7)) as a function depending only on q; this requires only the evaluation of S (see (8));
- find the maximum (let it be ϑ) of the L_MLRGLS function from (7) (where its derivative is 0 and the point is a global extreme point);
- prepare to start a loop on k, by setting it to 0 (k ← 0);
- note that, especially at the beginning of the iteration (when k = 0), ϑ and q_k may differ considerably; a major change in q will accelerate the convergence but will also increase the likelihood of divergence; therefore, use the new (ϑ) and old (q_k) values to damp the change in q, as q_{k+1} ← δ·q_k + (1 − δ)·ϑ, with a small 0 < δ < 1 to be determined;
- loop (k ← k + 1), using the new value of q (namely, q_{k+1}) to calculate the new values of the coefficients ((a_{j,k+1})_{1≤j≤m}, with (9)) and using the new values of the coefficients to calculate the new value of q (turning back to find the maximum from (7));
- repeat until ((a_{j,k+1})_{1≤j≤m}) and (q_{k+1}) no longer change between iterations.
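The steps above can be sketched end to end as follows. This is only a proof of concept with all names ours: the one-dimensional maximization over q is done by a plain grid search over a plausible range rather than by Algorithm 2, and numpy is assumed:

```python
import numpy as np
from math import gamma, log

def S_of(y, X, a, q):
    # Eq. (8)
    return float(np.sum(np.abs(y - X @ a) ** q))

def L_profile(S, n, q):
    # Eq. (7): likelihood with sigma eliminated through Eq. (6)
    return -n / q + n * log(n ** (1 / q) * q ** (1 - 1 / q)
                            / (2 * S ** (1 / q) * gamma(1 / q)))

def best_q(y, X, a, grid=np.linspace(0.5, 6.0, 111)):
    # crude stand-in for Algorithm 2: maximize Eq. (7) over a grid of q
    n = len(y)
    return float(max(grid, key=lambda q: L_profile(S_of(y, X, a, q), n, q)))

def coef_step(y, X, a, q):
    # Eq. (9): weighted normal equations, W frozen at the current a
    W = np.abs(y - X @ a) ** (q - 2.0)
    return np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * y))

def mlr_mle_gl(y, X, delta=0.9, tol=1e-5, max_iter=2000):
    a = np.linalg.lstsq(X, y, rcond=None)[0]  # classical OLS start, q_0 = 2
    q = 2.0
    for _ in range(max_iter):
        theta = best_q(y, X, a)                    # new candidate for q
        q_new = delta * q + (1.0 - delta) * theta  # damped update of q
        a_new = coef_step(y, X, a, q_new)
        if abs(q_new / q - 1.0) + np.sum(np.abs(a_new / a - 1.0)) < tol:
            return a_new, q_new
        a, q = a_new, q_new
    return a, q
```

With nearly normal errors the loop stays close to the starting point q = 2 and the coefficients close to the ordinary least squares solution, as expected from the discussion above.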
Upon arriving at the stationary point, all criteria for the maximization of the likelihood are fulfilled; namely, the equations corresponding to the cancellation of all derivatives are satisfied. The great advantage of the proposed method is that it reduces the problem of finding the maximum of a function of m + 3 variables to repeatedly finding the maximum of a function of one variable (q).
The disadvantage is that the evolution is driven by a contraction functional whose contraction property cannot be assured at all times. This is the reason why there are different strategies for finding such a contraction functional (see Example 6.1 in [32] for the construction of a contraction functional from resampling).
Some calculations are the same regardless of the strategy used; they are given next as Algorithms 1–3.
Algorithm 1: Calculate “S” at some step “j” from (8).
One strategy is to use the equations obtained from the cancellation of the derivatives with respect to the regression coefficients (see (9)) to iterate the values of the coefficients; another is to treat (9) as a system of equations and solve it as a whole (Algorithm 4).
Algorithm 4: Block solve providing (a_{u,j+1})_{1≤u≤m} at some step j with (9).
Require: (W_i)_{1≤i≤n} iterated at step j by Algorithm 3
(a_{u,j+1})_{1≤u≤m} ← (α_u)_{1≤u≤m}, the solution of the linear system of Eq. (9):
Σ_{i=1}^{n} y_i·x_{u,i}·W_{i,j} = Σ_{k=1}^{m} α_k · Σ_{i=1}^{n} x_{k,i}·x_{u,i}·W_{i,j}, 1 ≤ u ≤ m
Another strategy choice that must be specified is whether (7)–(9) are used simply to iterate towards new values or whether (9) should be used in an inner loop that converges to new estimates of the coefficients associated with the new q (Algorithm 5). The underlying expectation is that the errors are (nearly) normally distributed (q = 2) and that the optimal solution of (5) is near this point.
Algorithm 5: Double loop with (9) for (7) and (8).
Require: q_0, (a_{k,0})_{1≤k≤m} (coefficients iterated at step 0)
j ← 0
repeat
  j ← j + 1
  S_j ← Algorithm 1
  q_j ← Algorithm 2
  repeat
    (W_i)_{1≤i≤n} ← Algorithm 3
    (a_{u,j+1})_{1≤u≤m} ← Algorithm 4
  until Σ_{1≤k≤m} |a_{k,j+1}/a_{k,j} − 1| < ε
until |q_{j+1}/q_j − 1| + Σ_{1≤k≤m} |a_{k,j+1}/a_{k,j} − 1| < ε
The 2 × 2 combinations of the strategies given above were tested on sampled data (see Section 4), and the pair (Algorithms 4 and 5) turned out to be the only one providing a contraction functional. Thus, for convenience, the working algorithm is given in full (see Algorithm 6) and was used to obtain the results given in the next section.
Algorithm 6: Contraction functional for MLR-MLE-GL.
In order to assure the numerical stability of the calculations, Algorithm 6 was used with the fixed and reasonable value ε = 10^(−5), and in order to assure a smooth convergence, the new estimate of the error's power (q_j) was replaced by an exponentially smoothed one (a technique commonly applied to time series [33]): q_j ← 0.1·q_j + 0.9·q_{j−1}.
Therefore, in all scenarios, the initial (starting) values of the estimates to be determined are the ones given by the classical multiple linear regression model, as follows:

(10) q_0 ← 2; (a_{k,0})_{1≤k≤m} ← MLR(y, x_1, …, x_m), with MLR(y, x_1, …, x_m) = (xᵀx)^(−1)·xᵀ·y,

where MLR(y, x_1, …, x_m) uses the classical strategy of ordinary least squares ((xᵀx)^(−1)·xᵀ·y) to find the parameters.
4. Case Study
Ten datasets of chemical compounds with different sample sizes (Table 1), along with their measured outcome (activity or property), were considered to illustrate Algorithms 1–6.
Table 1: Datasets characteristics.

| Set | Sample size (n) | Class | Property/activity | Reference |
|---|---|---|---|---|
| 1 | 132 | Estrogens | Estrogen binding affinity, logRBA | [20] |
| 2 | 37 | Carboquinone derivatives | Minimum effective dose (MED), log(1/MED) | [21] |
| 3 | 33 | Organic pollutants | Oxidative degradation, log(k′) | [22] |
| 4 | 97 | Benzotriazoles | Fish toxicity, pEC50 | [23] |
| 5 | 136 | Thiophene and imidazopyridine derivatives | Inhibition of polo-like kinase 1, pIC50 | [24] |
| 6 | 14 | Substituted phenylaminoethanones | Average antimicrobial activity, pMICam | [25] |
| 7 | 110 | Acetylcholinesterase inhibitors | Inhibition activity, pIC50 | [26] |
| 8 | 107 | Polychlorinated biphenyl ethers | 298 K supercooled liquid vapor pressure, log(pL) | [27] |
| 9 | 107 | Polychlorinated biphenyl ethers | Aqueous solubility, log(Sw,L) | [27] |
| 10 | 47 | Para-substituted aromatic sulphonamides | Carbonic anhydrase II inhibition, log(Ki) | [28] |
For all datasets, the experimental values of the dependent variable (y) and of the previously reported independent variables, selected (under the assumption of normally distributed errors) for multiple linear regressions with two (m = 2) independent variables, are given in Table 2.
Table 2: Reported bivariate models.

| Set | Model under assumption of normal errors | Determination coefficient (r²) |
|---|---|---|
| 1 | −4.284 − 0.0263·TIE + 0.0368·TIC1 | 0.3976 |
| 2 | 7.780 − 579·IHDMkMg + 0.049·IHDDFMg | 0.7700 |
| 3 | −2.703 + 0.00515·SAG + 9.703·f0n | 0.6859 |
| 4 | 4.110 − 0.0172·TPSA(NO) + 0.0097·Aeigm | 0.7161 |
| 5 | 2.5651 + 0.1899·RDF035m + 2.9825·Small-RSI-mol | 0.5101 |
| 6 | 0.780 + 0.0339·χv0 + 0.004·μ | 0.8357 |
| 7 | 5.446 + 0.716·nR10 + 1.113·N-070 | 0.6838 |
| 8 | 1.476 − 0.588·NCl − 5.029·10⁻²·Vs+ | 0.9880 |
| 9 | −4.080 − 0.880·NCl + 5.996·σ²tot | 0.9619 |
| 10 | 4.055 − 0.154·χv0 − 1.284·FNSA1 | 0.7058 |
Different descriptors (independent variables) were used to explain the activity/property of interest in the models presented in Table 2. These descriptors are:
- TIE: state topological parameter [20];
- TIC1: total information content index (neighborhood symmetry of order 1) [20];
- IHDMkMg and IHDDFMg: MDF descriptors [21];
- SAG: molecular surface area grid; f0n: Fukui index [22];
- TPSA(NO): topological polar surface area expressed by nitrogen and oxygen contributions; Aeigm: Dragon descriptor [23];
- RDF035m: radial distribution function on a spherical volume of 3.5 Å radius, weighted by atomic mass; Small-RSI-mol: the smallest value of atomic steric influence in a molecule [24];
- χv0: Kier's molecular connectivity index; μ: dipole moment [25];
- nR10: number of 10-membered rings; N-070: number of Ar-NH-Al fragments [26];
- NCl: number of chlorine atoms on the two phenyl rings; Vs+: the surface maxima of the electrostatic potential; σ²tot: total variance of the electrostatic potential at a point r_i [27];
- FNSA1: fractional partial positive surface area 1 (PPSA1/TMSA, where PPSA1 = partial positive surface area and TMSA = total molecular surface area).
All sets subjected to analysis converged, maximizing the likelihood; Table 3 provides the differences between the values obtained by the classical MLR approach and the values obtained by the proposed approach (MLR-MLE-GL).
Table 3: Differences between the values of the coefficients obtained by the classical linear regression approach and by the proposed approach.

| Set | diff(q) | diff(a0) | diff(a1) | diff(a2) | diff(L_MLRGL) | diff(σ) |
|---|---|---|---|---|---|---|
| 1 | 0.3400 | −0.00073 | −0.00315 | 0.24400 | −0.30000 | −0.00200 |
| 2 | −0.4150 | −0.00034 | −16.30000 | 0.17400 | −0.10100 | −0.00020 |
| 3 | −0.3830 | −0.28700 | 0.00009 | −0.04000 | −0.06000 | −0.00030 |
| 4 | −0.1680 | 0.00006 | 0.00007 | −0.01400 | −0.05000 | 0.00000 |
| 5 | 0.9420 | 0.34500 | −0.00880 | −0.62400 | −6.10000 | −0.00850 |
| 6 | 0.5000 | 0.00027 | 0.00078 | −0.02140 | −0.09000 | −0.00006 |
| 7 | 0.5210 | −0.10300 | 0.03490 | −0.01800 | −1.10000 | 0.00030 |
| 8 | −0.5690 | 0.00090 | −0.00330 | −0.01100 | −0.42000 | −0.00010 |
| 9 | −0.4370 | −0.27700 | −0.00020 | 0.04000 | −0.30000 | −0.00020 |
| 10 | −0.9370 | 0.01400 | −0.00700 | 0.06000 | −0.70600 | 0.49310 |
diff: difference between the value obtained by the classical approach and the value obtained by the proposed approach.
a0, a1, and a2: coefficients of the independent variables; q: power of the error (Algorithm 6 for the proposed approach).
σ: population standard deviation; LMLRGL: likelihood for multiple linear regressions under assumption of GL distribution.
The results presented in Table 3 reveal different estimates of the coefficients under the assumption of the more general generalized Gauss-Laplace distribution of the error. In 6 out of 10 cases, the power of the error proved to be higher than the conventional value of 2, the highest value being observed for set 10 (q = 2.937). Conversely, the power of the error proved to be almost half the expected value for set 5 (q = 1.058). The values of the coefficients obtained by the classical approach and by the proposed approach were close to each other in two cases (set 4 and set 8). With one exception, represented by set 2, the sum of the absolute differences of a0, a1, and a2 was less than 1. The values obtained for the population standard deviation by the two methods proved to be the closest to each other, with the highest difference of 0.49310 observed for set 10.
The power of the error follows different patterns depending on the model: decrease-fluctuation-plateau (set 1, Figure 1(a)), decrease-stepwise increase-plateau (set 6, Figure 1(b)), increase-fluctuation-stepwise decrease-plateau (set 8, Figure 1(c)), and decrease-fluctuation (set 5, Figure 1(d)).
Figure 1: Evolution of the power of the errors (q) by optimization iteration: (a) set 1 (converged at iteration 226); (b) set 6 (converged at 154); (c) set 8 (converged at 83); and (d) set 5 (converged at 784).
A question (hypothesis) can be raised about the power of the error: can its distribution be assumed normal? This hypothesis (that the distribution of the power of the error can be assumed normal) can be tested on the results, even if the sample is small (10 cases), to provide an answer. The tendency of the power to converge towards a mean of two is clear (q̄ = 2.06 over the 10 cases), and the hypothesis of its normality cannot be rejected: the Anderson-Darling statistic indicates that only 14.72% of random samples are in better agreement with the normal distribution (p_to-reject = 0.8528 > 0.05), while the Kolmogorov-Smirnov statistic indicates only 28.7% (p_to-reject = 0.713 > 0.05).
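The reported mean can be reproduced from Table 3 (recall that diff(q) is the classical value, 2, minus the proposed estimate, so q = 2 − diff(q)). The sketch below (our code, standard library only) recomputes q̄ and the Kolmogorov-Smirnov distance to the fitted normal; the quoted p-values are those of the original analysis and are not recomputed here:

```python
from math import erf, sqrt

# diff(q) per set, taken from Table 3
diffs = [0.3400, -0.4150, -0.3830, -0.1680, 0.9420,
         0.5000, 0.5210, -0.5690, -0.4370, -0.9370]
qs = sorted(2.0 - d for d in diffs)   # converged powers of the error
n = len(qs)
mean = sum(qs) / n                    # reported as 2.06
sd = sqrt(sum((q - mean) ** 2 for q in qs) / (n - 1))

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Kolmogorov-Smirnov distance between the empirical CDF and N(mean, sd)
D = max(max((i + 1) / n - phi((q - mean) / sd),
            phi((q - mean) / sd) - i / n)
        for i, q in enumerate(qs))
```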
5. Conclusions
The proposed algorithm (Algorithm 6 in this paper) was found to provide an appropriate contraction mapping for the maximum likelihood estimation of the multiple linear regression parameters under the generalized Gauss-Laplace distribution assumption on the measurement errors. The analysis conducted on 10 samples demonstrated that, in general, it is not appropriate to assume that the measurement error is normally distributed, and, whenever possible, a deeper treatment of the distribution of the error needs to be conducted. For the sample of 10 cases, the analysis showed that a normal distribution of the power of the error could not be rejected, very likely with a mean equal to two.
Competing Interests
The authors declare that there are no competing interests regarding the publication of this paper.
References

[1] Galton F. Section H. The British Association reports. 1885;32:502–510. doi:10.1038/032502a0
[2] Galton F. Regression towards mediocrity in hereditary stature. 1886;15:246–263. doi:10.2307/2841583
[3] Bravais A. Analyse mathématique sur les probabilités des erreurs de situation d'un point. 1846;9:255–332.
[4] Pearson K. Mathematical contributions to the theory of evolution. III. Regression, heredity, and panmixia. 1896;187:253–318. doi:10.1098/rsta.1896.0007
[5] Fisher R. A. On an absolute criterion for fitting frequency curves. 1912;41:155–160.
[6] Fisher R. A. Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. 1915;10(4):507–521. doi:10.2307/2331838
[7] Soper H. E., Young A. W., Cave B. M., Lee A., Pearson K. On the distribution of the correlation coefficient in small samples. Appendix II to the papers of 'Student' and R. A. Fisher. 1917;11(4):328–413. doi:10.1093/biomet/11.4.328
[8] Fisher R. A. On the 'probable error' of a coefficient of correlation deduced from a small sample. 1921;1:3–32.
[9] Fisher R. A. The distribution of the partial correlation coefficient. 1924;3:329–332.
[10] Jäntschi L., Pruteanu L. L., Cozma A. C., Bolboacă S. D. Inside of the linear relation between dependent and independent variables. 2015;2015:360752. doi:10.1155/2015/360752
[11] Legendre A. M. Paris, France: F. Didot; 1805.
[12] Stigler S. M. Gauss and the invention of least squares. 1981;9(3):465–474. doi:10.1214/aos/1176345451
[13] Fisher R. A. Theory of statistical estimation. 1925;22(5):700–725. doi:10.1017/s0305004100009580
[14] Nelder J. A., Wedderburn R. W. Generalized linear models. 1972;135(3):370–384. doi:10.2307/2344614
[15] Jäntschi L., Bolboacă S. D. Observation vs. observable: maximum likelihood estimations according to the assumption of generalized Gauss and Laplace distributions. 2009.
[16] Gauss C. F. Hamburg, Germany: Perthes et Besser; 1809 (translated in 1857 as: Theory of Motion of the Heavenly Bodies Moving about the Sun in Conic Sections (trans. C. H. Davis), Little Brown, Boston, Mass, USA; reprinted in 1963, Dover, New York, NY, USA).
[17] Laplace P. S. Paris, France: Courcier; 1812.
[18] Varanasi M. K., Aazhang B. Parametric generalized Gaussian density estimation. 1989;86(4):1404–1415. doi:10.1121/1.398700
[19] Mäkeläinen T., Schmidt K., Styan G. On the existence and uniqueness of the maximum likelihood estimate of a vector-valued parameter in fixed-size samples. 1981;9(4):758–767. doi:10.1214/aos/1176345516
[20] Li J., Gramatica P. The importance of molecular structures, endpoints' values, and predictivity parameters in QSAR research: QSAR analysis of a series of estrogen receptor binders. 2010;14(4):687–696. doi:10.1007/s11030-009-9212-2
[21] Bolboacă S. D., Jäntschi L. Comparison of quantitative structure-activity relationship model performances on carboquinone derivatives. 2009;9:1148–1166. doi:10.1100/tsw.2009.131
[22] Jia L., Shen Z., Guo W., Zhang Y., Zhu H., Ji W., Fan M. QSAR models for oxidative degradation of organic pollutants in the Fenton process. 2015;46:140–147. doi:10.1016/j.jtice.2014.09.014
[23] Cassani S., Kovarich S., Papa E., Roy P. P., van der Wal L., Gramatica P. Daphnia and fish toxicity of (benzo)triazoles: validated QSAR models, and interspecies quantitative activity-activity modelling. 2013;258-259:50–60. doi:10.1016/j.jhazmat.2013.04.025
[24] Comelli N. C., Duchowicz P. R., Castro E. A. QSAR models for thiophene and imidazopyridine derivatives inhibitors of the Polo-Like Kinase 1. 2014;62:171–179. doi:10.1016/j.ejps.2014.05.029
[25] Verma D., Kumar P., Narasimhan B., Ramasamy K., Mani V., Mishra R. K., Majeed A. B. A. Synthesis, antimicrobial, anticancer and QSAR studies of 1-[4-(substituted phenyl)-2-(substituted phenyl azomethyl)-benzo[b][1,4]diazepin-1-yl]-2-substituted phenylaminoethanones. 2015. doi:10.1016/j.arabjc.2015.06.010
[26] Vitorović-Todorović M. D., Cvijetić I. N., Juranić I. O., Drakulić B. J. The 3D-QSAR study of 110 diverse, dual binding, acetylcholinesterase inhibitors based on alignment independent descriptors (GRIND-2). The effects of conformation on predictive power and interpretability of the models. 2012;38:194–210. doi:10.1016/j.jmgm.2012.08.001
[27] Hui-Ying X., Jian-Wei Z., Gui-Xiang H., Wei W. QSPR/QSAR models for prediction of the physico-chemical properties and biological activity of polychlorinated diphenyl ethers (PCDEs). 2010;80(6):665–670. doi:10.1016/j.chemosphere.2010.04.050
[28] Singh J., Shaik B., Singh S., Agrawal V. K., Khadikar P. V., Deeb O., Supuran C. T. Comparative QSAR study on para-substituted aromatic sulphonamides as CAII inhibitors: information versus topological (distance-based and connectivity) indices. 2008;71(3):244–259. doi:10.1111/j.1747-0285.2007.00625.x
[29] Sinz F., Gerwinn S., Bethge M. Characterization of the p-generalized normal distribution. 2009;100(5):817–820. doi:10.1016/j.jmva.2008.07.006
[30] Fisher R. A., Mackenzie W. A. Studies in crop variation. II. The manurial response of different potato varieties. 1923;13(3):311–320. doi:10.1017/s0021859600003592
[31] Rus I. A., Petruşel A., Şerban M. A. Weakly Picard operators: equivalent definitions, applications and open problems. 2006;7(1):3–22.
[32] Klaassen C. A., Putter H. Efficient estimation of Banach parameters in semiparametric models. 2005;33(1):307–346. doi:10.1214/009053604000000913
[33] Jäntschi L. PhD thesis (advisor R. E. Sestraş; in Romanian). Cluj-Napoca, Romania: University of Agricultural Sciences and Veterinary Medicine; 2010.