Robustness Analysis of an Outranking Model Parameters’ Elicitation Method in the Presence of Noisy Examples

One of the main concerns in Multicriteria Decision Aid (MCDA) is robustness analysis. Some of the most important approaches to model decision maker preferences are based on fuzzy outranking models whose parameters (e.g., weights and veto thresholds) must be elicited. The so-called preference-disaggregation analysis (PDA) has been successfully carried out by means of metaheuristics, but this kind of work lacks a robustness analysis. Based on the above, the present research studies the robustness of a PDA metaheuristic method to estimate model parameters of an outranking-based relational system of preferences. The method is considered robust if the solutions obtained in the presence of noise can maintain the same performance in predicting preference judgments in a new reference set. The research shows experimental evidence that the PDA method keeps the same performance in situations with up to 10% of noise level, making it robust.


Introduction
One of the main concerns in Multicriteria Decision Aid (MCDA) is the robustness of the methods developed in this field. The term robust refers to the capacity for withstanding "vague approximations" and/or "zones of ignorance" in order to prevent the degradation of the properties that must be maintained [1]. With this idea in mind, it is important to establish how robust a new method in MCDA is.
A wide variety of problems in decision aiding involve multiple objectives to be minimized or maximized simultaneously. Because of the conflicting nature of the criteria, it is not possible to obtain a single optimum, and consequently, the ideal solution to a multiobjective optimization problem (MOP) cannot be reached. Therefore, analysts resort to approaches that can handle multiple criteria and, at the same time, can shrink the set of solutions they provide to those matching the specific interests of a decision maker.
Several approaches that solve MOPs are based on Multiobjective Evolutionary Algorithms (MOEAs) and assume a model of the decision maker's (DM) preferences. This work focuses on the preference model proposed by Fernandez et al. in 2011. The model uses fuzzy outranking relations to incorporate preferences into MOEAs, such as strict preference, weak preference, and k-preference. It also allows the mapping of a many-objective problem into a surrogate problem with only three objectives. The method has been applied to a wide variety of problems, including portfolio problems with many objectives and partial project support [2].
In order to apply the preference model of Fernandez et al. [3], the outranking model's parameters must be elicited, for example, weights and thresholds required by the index of credibility of the outranking, a cutting level, and some additional symmetric and asymmetric parameters.
Information about the model's parameters can be obtained either directly or indirectly. On the one hand, the direct eliciting method has been criticized by Marchant [4] and Pirlot [5], who argue that the only valid preference input information is that arising from the DM's preference judgments about actions or pairs of actions. As stated by Covantes et al. [6] and Doumpos et al. [7], these criticisms are even more significant in the frame of outranking methods, since the DM must set parameters that are very unfamiliar to her/him (e.g., veto thresholds). On the other hand, indirect elicitation methods use regression-inspired techniques for inferring the model's parameters from a set of decision examples [6, 7].
In the frame of outranking methods, preference-disaggregation analysis (PDA) approaches were pioneered by Mousseau and Slowinski [8]. They proposed to infer the ELECTRE TRI model's parameters (except veto thresholds) from a set of assignment examples by using nonlinear programming. Mousseau et al. [9] proposed a method to infer the weights from assignment examples through linear programming. Ngo The and Mousseau [10] used assignment examples to elicit the boundary profiles in ELECTRE TRI. Methods dealing with the indirect elicitation of weights under inconsistent sets of assignment examples have been addressed by Mousseau et al. [11, 12]. Dias et al. [13] integrate the elicitation phase interactively with a robustness analysis.
Most of the related papers elude the inference of veto thresholds because eliciting all the parameters simultaneously requires solving a very complex nonlinear programming problem. Two papers proposed the use of evolutionary algorithms (EAs) to infer the entire set of ELECTRE model's parameters from a set of assignment examples (cf. [7, 14]). EAs are powerful tools for the treatment of nonlinearity and global optimization in polynomial time [15]; as a more recent example, Álvarez et al. [16] used them to infer parameters that aid the decision process at the collective level.
To the best of our knowledge, Fernandez et al. [17] was the first paper in which the reference information did not come from assignment examples, but from preference statements such as "x is at least as good as y" and "x is not at least as good as y." That paper infers the entire set of parameters of the ELECTRE III model and of the generalized outranking model with reinforced preferences proposed by Roy and Słowiński [18].
Cruz-Reyes et al. [19] proposed a PDA method to infer the entire parameter set of the relational system of preferences from Fernandez et al. [3]. This approach allows the introduction of the DM's preferential judgments through pairwise comparisons of different actions. However, that work lacks a robustness analysis, which would allow measuring its capacity for withstanding vague approximations and/or zones of ignorance derived from its formal representation.
Based on the above, this research proposes a method for the robustness analysis of the solutions offered by PDA methods based on metaheuristics. The case study is the Genetic Algorithm from the work of Cruz-Reyes et al. [19], which is used as a PDA method for the relational system of preferences proposed by Fernandez et al. [3]. The method is considered robust if it maintains the same performance with or without noise in the reference set; otherwise, it can be concluded that the method provides sensitive solutions. As a result, the experimental design demonstrates the method's robustness by identifying that it estimates parameter value sets with a statistically nonsignificant difference when the noise levels are equal to or smaller than 10%.
Hence, the main contributions of this work are the proposed method for robustness analysis and the noise model developed to introduce noise into a reference set. It is important to emphasize that both the method and the model are the first to consider the full set of parameters in Fernandez et al. [3]. Another important contribution of this work is the identification of the zone in the parameter space and the level of noise where the response sets are most compatible with the DM's preferences. It should be noted that the noise concept is related to the inconsistencies, or errors, between the preference model and the DM.
Aside from this introduction, the paper is organized as follows. Section 2 presents the optimization approach to estimate the parameter values that are subject to the robustness analysis, the associated surrogate model, and the elements required for its definition. Section 3 shows the method followed in this work to perform the robustness analysis. Sections 4 and 5 present the experimental design conducted to evaluate the robustness of the optimization approach and the results obtained from it. Finally, Section 6 offers some concluding remarks derived from the research.

Optimization Approach for Inferring the Model's Parameter Values
This section is organized as follows. Firstly, it gives the definition of the optimization problem used as the inference approach for the estimation of parameter values. This is followed by the optimization approach used to solve the studied problem and the description of the metaheuristic used. Finally, it presents the method that served as a basis for the robustness analysis of the optimization approach for the inference of the parameter values.
2.1. The Inference Approach. The best compromise is a solution of a problem associated with the DM's preferences. As stated by Branke et al. [21], there has been an increasing interest in incorporating the DM's preference information into the search process. This interest is due to its influence on reducing the cognitive effort needed to identify a solution that best matches those preferences and on reinforcing the selective pressure toward the Pareto frontier, for example, Cruz et al. [22]. A survey of strategies to incorporate preferences into multiobjective approaches can be found in [23, 24]. Particularly, this research deals with preference models based on outranking relations, such as the one developed in the works of Roy [20] and Fernandez et al. [3]. In these works, the preference model approaches situations concerning the behavior of real DMs using a relational system of preferences. The six binary relations that lie in that system are indifference (I), strict preference (P), weak preference (Q), incomparability (R), k-preference (K), and nonpreference (∼). These relations are associated with the predicate "the DM considers that option x is at least as good as y" through a degree of truth σ(x, y) in [0, 1]. Table 1 shows each outranking relation and its notation in columns one and two, and the necessary conditions that need to be satisfied for each relation in column three. The parameters ρ used in the computation of the function σ(x, y, ρ) (cf. [19]), in combination with the credibility (λ), symmetry (β), and asymmetry (ε) thresholds, determine the preference relations.
An inference approach for the previous relational system would be a strategy that properly estimates the values of the parameters (ρ, λ, β, ε); that is, it locates the configuration of the parameters that minimizes the inconsistencies between the DM's preferences and those identified by the model. This problem can be formally defined as follows. Let V_fr be the set of feasible parameter vectors (ρ, λ, β, ε), O = {P, Q, I, R, ∼, K} the set of preference relations, and T a reference set formed by pairs (x, y) of possible alternatives, each with an associated relation xOy reflecting the preference judgment from a DM. The ideal solution of the parameter elicitation problem would be (ρ0, λ0, β0, ε0) ∈ V_fr such that the equivalences (1a)-(1f), one per relation in O, are satisfied for all (x, y) ∈ T. The best parameter setting should be the solution closest to the ideal one in the sense of a certain acceptable metric. Following the works of Fernandez et al. [3, 17], we propose a metric based on the so-called inconsistencies. Briefly, an inconsistency arises when an equivalence in (1a)-(1f) is not fulfilled. For example, given a DM's judgment on (x, y) ∈ T and inferred parameter values (ρ, λ, β, ε) ∈ V_fr, we call an inconsistency with (1a) the fact that x is strictly preferred to y by the DM and yet not xP(ρ, λ, β, ε)y, or, vice versa, xP(ρ, λ, β, ε)y although x is not strictly preferred to y by the DM.
Let N1, N2, . . ., N6 be the numbers of inconsistencies in T with (1a) through (1f), respectively. Since the method by Fernandez et al. in 2011 gives priority to finding the nonstrictly outranked set, N1 is by far the most important measure. The information coming from Q and K is also considered and turns out to be more important than that provided by I, R, and ∼. So, the best values for (ρ, λ, β, ε) can be obtained from the solution of the following optimization problem (2): minimize (f1, f2, f3) over V_fr, where V_fr is the feasible region of parameter values, f1 = N1, f2 = N3 + N4, and f3 = N2 + N5 + N6, with preemptive priority favoring f1 over f2 and f2 over f3. Note that, in this problem, xOy represents a preference judgment established by a DM, x(ρ0, λ0, β0, ε0)y represents an ideal preference statement derived from the outranking relational system, and x(ρ, λ, β, ε)y represents a preference statement from estimated parameters, which is derived from the outranking relational system. This problem has been studied by Cruz-Reyes et al. [19], and the PDA method used to properly estimate the values of its parameters is described in the following section.

Figure 1: Best1 ← Algorithm(P, f1); Best2 ← Algorithm(P, f2, Best1); (ρ*, λ*, β*, ε*) ← Algorithm(P, f3, Best1, Best2).

Optimization Approach Based on a PDA Method

The multiobjective indirect elicitation method studied herein is the one proposed by Cruz-Reyes et al. [19]. The method is based on mono-objective optimization algorithms that solve problem (2). It exploits the preemptive priority established for the problem and finds the best value for the first objective; then, this value is used as a bound when the second objective is minimized. Finally, f3 is minimized while keeping the minimum values previously obtained for (f1, f2).
Figure 1 depicts the general procedure followed by the approach. Each metaheuristic algorithm solves P = (DM_sim, T), that is, an instance of problem (2) formed by the parameter values that define the model of preferences for a simulated DM, denoted DM_sim, and its reference set T, trying to minimize only the objective f1. The best value Best1 is used as a bound in the next step of the method, where the same algorithm seeks the minimization of objective f2 while verifying that the solutions achieved yield a value of f1 equal to or smaller than Best1. Finally, in the last step, the algorithm minimizes the third objective of problem (2), that is, f3, while keeping the values of f1 and f2 not greater than Best1 and Best2, respectively. The best solution (ρ*, λ*, β*, ε*) arises from the parameter vectors obtained in the last step of this procedure.
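The staged procedure above can be sketched in a few lines. This is an illustrative stand-in, not the paper's implementation: the paper uses a Genetic Algorithm as the per-stage solver, while the toy solver here is an exhaustive search; the only assumption kept is that each stage minimizes its objective subject to the bounds achieved by the earlier stages.

```python
def lexicographic_minimize(solve, objectives):
    """Stage-wise minimization with preemptive priority: stage i minimizes
    objectives[i] subject to the best values found for objectives[0..i-1]."""
    bounds = []          # [(objective, best value achieved so far)]
    best = None
    for f in objectives:
        def staged(x, f=f, bounds=tuple(bounds)):
            # reject any candidate that violates an earlier bound
            if any(g(x) > b for g, b in bounds):
                return float("inf")
            return f(x)
        best = solve(staged)
        bounds.append((f, f(best)))
    return best

# Toy stand-in for the metaheuristic: exhaustive search over a candidate pool.
def make_solver(candidates):
    return lambda f: min(candidates, key=f)

candidates = [(a, b) for a in range(4) for b in range(4)]
f1 = lambda x: abs(x[0] - 2)        # highest-priority objective
f2 = lambda x: abs(x[1] - 3)        # second-priority objective
best = lexicographic_minimize(make_solver(candidates), [f1, f2])
print(best)  # (2, 3)
```

The second stage only searches among candidates that preserve the optimum of the first, which mirrors how Best1 is carried forward as a bound in Figure 1.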
The algorithm analyzed in this work was the Genetic Algorithm, the one with the best performance in the previous study. The evaluation function in each stage of the general procedure is related to objective f_i of problem (2). The encoding, the algorithms, and the fine-tuning process of the algorithms' parameters have been reported by Cruz-Reyes et al. [19].

Method for Robustness Analysis
Based on the work of [26], the concept of noise can be understood as random errors that are introduced in a sample. The presence of noise in an optimization process that seeks the adjustment of parameters of another strategy can affect the quality of its results. For example, a classification approach trained with an incorrect sample leads to misclassification of new outcomes, and a preference model with parameters estimated from a reference set containing erroneous preference judgments can derive inconsistencies with the DM's behavior. Because in many situations the presence of noise cannot be avoided, it is convenient to study the performance of methods that work in its presence and to identify how robust they can become.

Algorithm 1 (fragment): Begin. Step 1. S_small ← GenerateInstances(). ⋯
This work uses the concept of noise to conduct an investigation oriented toward a robustness analysis of the strategy proposed by Cruz-Reyes et al. [19], summarized in the previous section. For this purpose, this section presents the general methodology to evaluate its quality and details its components. The section is organized in three parts, which involve (a) the presentation of the method; (b) the generation of random instances; and (c) the model of noise for the instances. The definitions of the function that evaluates the quality of the solutions of the optimization approach and of the strategy followed to evaluate its robustness are given in the next section, related to the experimental design.
3.1. Description of the Method. Algorithm 1 depicts the method followed to analyze the robustness of the optimization approach described in Section 2. The method is developed as an ordered sequence of seven simple steps detailed in this section.
Firstly, in Step 1, the method randomly generates the set of instances S_small, which is formed by small reference sets T associated with a DM. In this work, the DM will be represented by a simulated DM and will be denoted by P0 = (ρ0, λ0, β0, ε0). The generation of S_small is done through the instance generator.
In Step 2, after the construction of S_small, the method solves it by using the Genetic Algorithm proposed by Cruz-Reyes et al. in 2017 as the optimization approach (i.e., the optimization approach based on a PDA method). The obtained solutions will be the best sets of parameter values P* that characterize the inference approach defined in problem (2) (see Section 2); these solutions represent estimations of the parameter values of the outranking model for each instance in S_small.
Once the parameter values P* are computed, the method uses the noise model during Step 3 to introduce a small amount of noise into S_small in order to produce a new set of instances S'_small. The noise is represented by a percentage of random modifications of correct preference statements toward wrong preference statements in S'_small. Then, in Step 4, the method, through the Genetic Algorithm, again infers the best sets of parameter values P' for the outranking relations in the new set of instances S'_small. The performance evaluation of the method is done in Steps 5 and 6. In Step 5, the method randomly generates a set of instances S_large formed by larger reference sets T. In Step 6, it compares the quality of the solutions in P* and P', which were previously obtained by the Genetic Algorithm, against the correct parameter values P0 that were used to produce the instance sets S_small and S'_small. The indicators Δ*_perf and Δ'_perf obtained in this step measure the number of inconsistencies found in S_large when using P* and P', respectively, as inferred parameters of the preference model and P0 as the real DM. Finally, the robustness of the Genetic Algorithm is analyzed statistically in Step 7. This step evaluates whether there is a significant difference between the performance of the approach when using the noisy instances S'_small and its performance when using instances without errors in the preference relations, that is, S_small. The result of this step is the validity of the null hypothesis H0.
In summary, the seven steps of the method involve five main components: (1) the instance generator; (2) the optimization approach; (3) the noise model; (4) the performance evaluation of the optimizer; and (5) the robustness analysis. These components are detailed in the remainder of this section, with the exception of the Genetic Algorithm used as the optimization approach, whose description can be found in the work of Cruz-Reyes et al. in 2017.
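The seven steps just summarized can be arranged into a minimal driver. Every callable name below is a hypothetical placeholder for the corresponding component (instance generator, optimizer, noise model, performance evaluation, statistical test); the demo stand-ins at the end exist only to exercise the control flow, not to reproduce the paper's components.

```python
def robustness_analysis(generate, optimize, add_noise, evaluate, test_equal,
                        noise_level=0.10):
    """Sketch of the seven-step method: elicit parameters from clean and
    noisy training instances, evaluate both on larger instances, and test
    whether their performance differs significantly (the H0 verdict)."""
    S_small = generate(size="small")                    # Step 1
    P_star = [optimize(inst) for inst in S_small]       # Step 2
    S_noisy = [add_noise(inst, noise_level)             # Step 3
               for inst in S_small]
    P_noisy = [optimize(inst) for inst in S_noisy]      # Step 4
    S_large = generate(size="large")                    # Step 5
    d_star = [evaluate(p, S_large) for p in P_star]     # Step 6: Δ*_perf
    d_noisy = [evaluate(p, S_large) for p in P_noisy]   # Step 6: Δ'_perf
    return test_equal(d_star, d_noisy)                  # Step 7

# Trivial numeric stand-ins, just to show the plumbing end to end.
verdict = robustness_analysis(
    generate=lambda size: [1, 2, 3] if size == "small" else [1, 2, 3, 4],
    optimize=lambda inst: inst,                 # "elicited parameters"
    add_noise=lambda inst, lvl: inst + lvl,
    evaluate=lambda p, large: abs(p - 2),       # toy inconsistency count
    test_equal=lambda a, b: abs(sum(a) - sum(b)) / len(a) < 0.5,
)
print(verdict)  # True: the mean performance gap is small, so H0 holds
```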

Instance Generation.
Given that this work studies the robustness of an optimization approach that characterizes DM_sim, it requires sets of instances that can aid this purpose. Two types of sets were identified: the training set, formed by small reference sets and denoted by S_small, and the testing set, formed by large reference sets and denoted by S_large. A third set S'_small was created from S_small by adding some noise to it; as a result, both sets were used to estimate the parameter values of the preference model.
Each instance is characterized by (i) a reference set T formed by a finite number of alternatives and (ii) a set of statements made over pairs (x, y) ∈ T × T, which are given by the DM as his/her preference judgments. The process of construction is described as follows.
A reference set T in any instance of S has N alternatives with 10 objectives whose values are randomly generated in the range [1, 10]. Every pair of alternatives (x, y) ∈ T × T has one preference statement defined by the DM. This statement is selected from O = {P, Q, I, R, K, ∼, P⁻¹, Q⁻¹, K⁻¹}, that is, the preferences that the real DM establishes in the instance.
In particular, the sets of instances in this work are S10 and S20, which use small reference sets with 10 and 20 alternatives, respectively. These small sets involve a manageable number of preference judgments that a DM can still take the time to define. The set of instances S100 uses large reference sets with 100 alternatives; this increased number of alternatives allows a better study of the robustness of the inference approach.
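Under the stated characteristics (N alternatives, 10 criterion values drawn from [1, 10], one statement per ordered pair), instance generation can be sketched as follows. Drawing judgments uniformly at random is a simplifying assumption made here for brevity: in the paper, the judgments are derived from a simulated DM's parameter values, and all names are illustrative.

```python
import random

# The nine relations a statement can take, per the set O described above.
RELATIONS = ["P", "Q", "I", "R", "K", "~", "P-1", "Q-1", "K-1"]

def generate_instance(n_alternatives, n_criteria=10, lo=1.0, hi=10.0,
                      seed=None):
    """Random reference set: each alternative is a vector of criterion
    values in [lo, hi]; every ordered pair (x, y), x != y, receives one
    preference statement (uniform here; a simulated DM would derive it
    from its own parameter values instead)."""
    rng = random.Random(seed)
    alternatives = [[rng.uniform(lo, hi) for _ in range(n_criteria)]
                    for _ in range(n_alternatives)]
    judgments = {(i, j): rng.choice(RELATIONS)
                 for i in range(n_alternatives)
                 for j in range(n_alternatives) if i != j}
    return alternatives, judgments

alts, judgments = generate_instance(100, seed=7)
print(len(judgments))  # 9900 ordered pairs, as in the S100 instances
```

With 100 alternatives there are 100 × 99 = 9900 ordered pairs, which matches the number of preference judgments used per instance of S100 below.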
The best sets of solutions P* and P' were contrasted in their ability to establish the preference relations properly. For this purpose, the 9900 preference judgments of each instance in S100 were used. The quality of P* and P' was measured as the number of inconsistencies they had with respect to the real DM (the parameters P0) on the same preference statements. The method followed for generating these instances is described below.

Noise Model: Incorporation of Noise into Instances.
This section describes the noise model, that is, the general procedure used to insert noise in the instances generated through the instance generation method.
Given an instance of S, a level of noise is introduced into it through incorrect pairwise statements xOy, also denominated errors. For this purpose, the noise model uses three steps: (1) the specification of a percentage of error η; (2) the computation of the number of wrong statements; and (3) the modification of the statements in the instance.
In the first step, the model specifies the percentage η of preference statements that will be modified from the original reference set. These statements reflect the incorrect information present in the reference set T. Let us note that the percentage values η considered in this work were taken from {0%, 5%, 10%, 20%, 50%}.
The second step of the model determines the number of preference statements that are chosen to be incorrect. This value is denoted by ηC(N, 2), where C(N, 2) is the maximum number of preferences defined with N alternatives in the instance S, and η is the chosen percentage of such preferences that will be modified to represent errors.
Finally, in the third step, a total of ηC(N, 2) preference statements xOy are selected at random from T. Then, each is modified according to its original statement under the opposite rules applied to the specific relation O involved, which are shown as conditionals (3a) through (3e).

xPy → xQy ∨ xIy ∨ xKy, (3a)
x⋯y → x⋯y ∨ x∼y, (3b)
x⋯y → x⋯y ∨ x⋯y. (3d)

The new set with the incorrect preference statements is denoted by T'. In the previous rules, the left-hand side denotes the original judgment xOy with (x, y) ∈ T, and the right-hand side represents the options to which it can be changed in T' (where ∨ is the disjunctive operator). Let us point out that whenever a preference statement xOy has more than one replacement option, one is chosen at random with equal probability.
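The three steps of the noise model can be sketched as follows. Only rule (3a) is fully legible in the text, so the remaining entries of the replacement table are illustrative assumptions, and all names are hypothetical.

```python
import random

# Admissible replacements per relation. Only the "P" row reproduces rule
# (3a) from the text; the other rows are assumed analogues of (3b)-(3e).
REPLACEMENTS = {
    "P": ["Q", "I", "K"],     # (3a): xPy -> xQy or xIy or xKy
    "Q": ["I", "~"],          # assumed
    "I": ["Q", "~"],          # assumed
    "K": ["R", "~"],          # assumed
    "~": ["I", "K"],          # assumed
}

def add_noise(judgments, eta, seed=None):
    """Turn a fraction eta of the preference statements into 'errors':
    select eta * |judgments| pairs at random and replace each selected
    relation with one of its admissible alternatives, chosen with equal
    probability (the three-step noise model)."""
    rng = random.Random(seed)
    replaceable = [p for p, r in judgments.items() if r in REPLACEMENTS]
    k = round(eta * len(judgments))            # number of wrong statements
    noisy = dict(judgments)
    for pair in rng.sample(replaceable, k):
        noisy[pair] = rng.choice(REPLACEMENTS[noisy[pair]])
    return noisy

# 10 alternatives -> 90 ordered pairs; a 10% noise level modifies 9 of them.
judgments = {(i, j): "P" for i in range(10) for j in range(10) if i != j}
noisy = add_noise(judgments, 0.10, seed=1)
changed = sum(1 for p in judgments if noisy[p] != judgments[p])
print(changed)  # 9
```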

Experimental Design
The method to analyze the robustness of the PDA-based optimization approach is presented in Section 3. This section details the experiment conducted to implement that method, and its content is organized as follows. The first part describes the indicator of quality used to evaluate the solution set offered by the Genetic Algorithm; this indicator is based on the inconsistencies produced by the approach when using the estimated parameter values obtained from each reference set T. The second part describes the robustness evaluation performed over the optimization approach; this part details the statistical analysis used to determine whether there is a significant difference between the performance of the optimization approach when using reference sets with and without noise.

Performance Evaluation.
The method in Section 3.1 generates the sets P* = (ρ*, λ*, β*, ε*) and P' = (ρ', λ', β', ε'). These are the sets of solutions produced by the PDA strategy when solving the sets of instances S_small and S'_small, without and with noise, respectively. The quality of the sets P* and P' is evaluated using the indicator I_P. This indicator measures the error in the prediction capacity of the sets as the differences in the estimation of preference statements. The details of this indicator are presented in the remainder of this section.
The calculation of the estimated error I_P requires P0 = (ρ0, λ0, β0, ε0), that is, the parameter value settings of DM_sim that were used in the generation of a reference set T. The expected preference judgment xO(P0)y in T is compared for similarity against the estimated preference statements xO(P')y and xO(P*)y. Then, the error is measured as the number of inconsistent preference statements that each solution set, P* and P', accumulated with respect to the ones defined by P0.
Hence, the indicator I_P counts as an inconsistency each case in which a preference relation xO(P0)y does not match the estimated relation xO(P*)y (or the relation xO(P')y obtained from instances with noise). In other words, the parameter values obtained from PDA failed to predict the desired judgment and instead produced a relation O' ≠ O. The indicator I_P is formally defined in (4).
Note that I_P can be seen as a measure of the level of discordance with the DM produced by the parameters elicited by the PDA strategy. Also, let us point out that I_P is also reported through the performance indicators Δ*_perf and Δ'_perf; these symbols represent the quality of the solutions from S_small and S'_small, respectively, in the robustness analysis method of Section 3.1. Finally, let us indicate that the sets of instances with small reference sets are S10 and S20, and the set of instances with large reference sets is S100.
I_P = Σ_{(x,y)∈T} g1(x, y), (4)

where g1(x, y) is a function whose value is 1 if the biconditional xO(P0)y ⇔ xO(P*)y is false and 0 otherwise, and T is the reference set.
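A sketch of the indicator, assuming the preference relations are available as labels per pair; in the paper they are derived from the true and elicited parameter values through the outranking model, so the dictionaries here are stand-ins.

```python
def inconsistency_indicator(reference, predicted):
    """I_P: number of pairs whose relation under the true parameters P0
    differs from the relation derived from the elicited parameters, i.e.,
    pairs where the biconditional in (4) is false."""
    return sum(1 for pair, rel in reference.items()
               if predicted.get(pair) != rel)

# Toy example: one of three judgments is mispredicted (Q vs I).
reference = {("a", "b"): "P", ("b", "c"): "Q", ("a", "c"): "P"}
predicted = {("a", "b"): "P", ("b", "c"): "I", ("a", "c"): "P"}
print(inconsistency_indicator(reference, predicted))  # 1
```

Applying the same function with P* and with P' yields the values reported as Δ*_perf and Δ'_perf.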

Robustness Evaluation.
Experimentation on evolutionary algorithm implementations is needed to achieve better predictions about their performance and robustness.
Statistical testing plays a central role in making the analysis of experiments on algorithms a more rigorous area. Many state-of-the-art publications report conclusions in terms of statistically meaningful coefficients such as p values [27, 28].
A p value is the probability, under an assumed model for the data including the null hypothesis, that a statistical summary of the data (e.g., the sample mean of differences between two compared groups) would be equal to or more extreme than its observed value in the analyzed sample. The p value is an index of the incompatibility between the data in the sample and the proposed model. The smaller the p value, the greater the statistical incompatibility of the data with the null hypothesis. A small p value, for example, one below a significance level of 0.05, offers some evidence against the null hypothesis. Likewise, a large p value gives some evidence in favor of it. A formal definition of the p value is presented by Bartz-Beielstein et al. [29].
The robustness analysis of the Genetic Algorithm used for parameter value estimation by Cruz-Reyes et al. in 2017 is performed with nonparametric statistical tests, because the results of evolutionary algorithms regularly do not satisfy the assumptions of parametric tests [30]. The robustness evaluation is centered on the demonstration that there is no significant difference between the performances Δ*_perf and Δ'_perf; in other words, the quality of the solutions measured by I_P has similar means when using the set of instances S_small or S'_small, that is, without or with noise. The statistical analysis uses the indicators Δ*_perf and Δ'_perf of the quality of the solutions derived from the method in Section 3.1. The indicator Δ*_perf is computed for each instance in the sets S10 and S20. The value of the indicator Δ'_perf is also computed, but using the different levels of noise η ∈ {5%, 10%, 20%, 50%} for each instance in S'10 and S'20. The different values of Δ*_perf and Δ'_perf are grouped according to the level of noise (where zero percent means no noise). After that, a statistical evaluation of their means is carried out.
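The rank-based comparison of noise-level groups can be illustrated with the classic Friedman statistic. Note that the study itself uses the Friedman Aligned Ranks variant through the STAC platform; the plain form below (with no tie handling and toy numbers) is shown only as an approximation of that rank-based comparison.

```python
def friedman_statistic(groups):
    """Classic Friedman chi-square over k related groups: columns are the
    noise-level groups, rows are the instances. Each instance's k values
    are ranked 1..k, and the statistic measures how unequal the per-group
    rank sums are. No tie handling (ranks of equal values are arbitrary)."""
    k, n = len(groups), len(groups[0])
    rank_sums = [0.0] * k
    for i in range(n):                       # rank within each instance
        order = sorted(range(k), key=lambda j: groups[j][i])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return (12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums)
            - 3.0 * n * (k + 1))

# Toy indicator values grouped by noise level (e.g., 0%, 10%, 50%),
# one column per group, four instances per group, no ties.
delta = [[5, 6, 4, 5], [6, 7, 5, 6], [9, 10, 8, 9]]
stat = friedman_statistic(delta)
print(stat)  # 8.0
```

For k − 1 = 2 degrees of freedom, the 5% chi-square critical value is about 5.99, so this toy statistic of 8.0 would lead to rejecting H0 for these made-up groups.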
This work uses the STAC Web Platform (a statistical web tool found at http://tec.citius.usc.es/stac/ranking.html), which is equipped with different statistical tests. Within this platform, the test selected to perform the statistical evaluation was the nonparametric multiple-groups One vs All comparison, with the Friedman Aligned Ranks Test as the ranking test. Additionally, post hoc methods for p value adjustment were specified. During the statistical tests, the significance level was set at 0.05. In the ranking test, the null hypothesis H0 was as follows: the means of the results of two or more algorithms are the same. In the post hoc analysis, the null hypothesis H0 was as follows: the mean of the results of the control method is equal to that of the other groups (compared in pairs).
The results from the evaluation of the performance of each solution and from the analysis of robustness are given in the following section.

Results
This section presents the results derived from the robustness analysis of the Genetic Algorithm, organized in two subsections. The first subsection presents the summary of the values of the performance indicator I_P obtained when the estimations from the sets of instances S10 and S20 are used to predict preference statements on the new set of instances S100; in both cases, the results are shown in the presence and absence of noise. The second subsection summarizes the results of the statistical evaluation; according to these data, the method reveals a tolerance of the optimization approach to noise levels of up to 10%, a result considered to be robust.

Results from Evaluation of the Performance.
The first part of the experiment measured the performance using the indicator I_P. Table 2 presents the values achieved for each instance of the sets S10 and S20. The first column of this table shows the number of instances, or reference sets T, per set, which was 40. Columns two to six present the values of I_P for S10 at η = {0%, 5%, 10%, 20%, 50%}, that is, each different level of noise considered; observe that the value η = 0% corresponds to the case without noise. Columns seven to eleven present the values of I_P analogously for S20.
From the results in Table 2, the accumulated number of inconsistencies deviates by at most 250 from Δ*_perf with noise levels of up to 10%, in both sets. The difference is at least 500 inconsistencies when the noise level is 20% or greater. This result is similar when the instances are considered individually. For example, there are more than 20 instances in S10 and S20 with noise levels of up to 10% whose inconsistencies deviate at most 10% from those produced by Δ*_perf; the number of such instances is considerably reduced for greater noise values at the same deviation. This result raises the question: what level of noise can be considered different from the situation where no noise exists? This question is addressed in the next part of the experiment, where a statistical evaluation provides the answer. Note that, from these results, the level of disagreement increases, as expected, with the increment in the level of noise.

Results from the Statistical Evaluation.

Table 3 summarizes the results obtained from ranking through the Friedman Aligned Ranks Test. It presents the p value obtained for the considered relations, as well as the result of the hypothesis H0 being validated. Note that a significant difference is found for some of the compared groups.
Tables 4(a) and 4(b) summarize the results of the post hoc analysis for S10 and S20, respectively. The relationships where the null hypothesis H0 is accepted (i.e., where there is NO significant difference in the performance of the PDA method) are highlighted in bold. The results clearly show that the comparisons between the reference set without noise (noise level of 0%) and the other groups present a significant difference starting from a noise level of 20%. These results support the claim that the significant difference in the quality of the results of the studied PDA method starts at a noise level of 20%.
According to the results obtained, the method of learning the preferences of a DM based on the PDA method proposed by Cruz-Reyes et al. [19] proves robust, since it is statistically demonstrated that there is no significant difference in the quality of the solutions provided by the method when applied to two different sets of instances, S10 and S20, formed of reference sets with 10 and 20 alternatives, respectively. The conducted experiment showed that, with noise levels η of up to 10%, the solutions produced by the method had the same performance in predicting the preference statements of a new set S100 formed by instances with larger reference sets T.
On the other hand, the method starts to show a significant difference in performance when the error rate is greater than 10%. Therefore, when the DM introduces a small number of errors in his/her preferences within the provided reference set, the analyzed optimization approach, based on the PDA method, may still produce an appropriate estimate of the parameter values of the ad hoc preference model used by the DM. These results make the method robust (i.e., fault tolerant).

Concluding Remarks
The present work proposes a method for the robustness analysis of PDA strategies that work with the complete set of the ELECTRE III model's parameters and the generalized outranking model with reinforced preferences proposed by Roy and Słowiński [18]. The method performs the parameters' elicitation with different levels of noise and validates whether or not there is a significant difference in the use of such values to model the preferences of a decision maker (DM).
Another contribution of this research is the noise model. This model arises from the need, in the robustness analysis, to bias the reference set toward a region that represents incorrect preference statements of a DM, that is, his/her errors. The proposed noise model randomly replaces a percentage of the original preference statements with others that are closely related (resembling a vague approximation of the real desires of a DM).
The main finding derived from the experiments on the analyzed case study, the PDA method of Cruz-Reyes et al. [19], was that the method elicits parameters with a tolerance of up to 10% of error in the reference set. This tolerance means that even though the DM could introduce by mistake some wrong preference statements into the reference set, the PDA method is still able to estimate proper parameter values for the preference model that characterizes him/her. Within this context, it could be possible to use the proposed method on other PDA strategies in the frame of outranking approaches that handle the reference set as pairs of alternatives. It is important to note that this work assumes that the reference set is provided as preference statements over pairs of alternatives and does not consider cases where the preference information could be given by a set of sorted alternatives. For the latter case, at least a different noise model would be required; hence, it constitutes a line of research for future work.

Table 4: Results of post hoc analysis for sets S10 and S20. Summary of the adjusted p values obtained from different tests. (a) Adjusted p values obtained for set S10.