Investment Decision Support for Engineering Projects Based on Risk Correlation Analysis

Investment decisions are usually based on the subjective judgments of experts who face an information gap during the preliminary stages of a project. As a consequence, errors in risk prediction and decision-making accumulate, leading to out-of-control investment and project failure. This paper presents an approach that integrates variable fuzzy set theory and intelligent algorithms with case-based reasoning (CBR). The proposed approach handles the numerous fuzzy concepts and variable factors of a project and grounds the decision-making process in past cases and experiences. It also lowers the calculation difficulty and shortens the decision-making reaction time. Three types of risk correlations, combined with different characteristics of engineering projects, are summarized, and each is expounded at the project investment decision-making stage. The quantitative and qualitative change theories of variable fuzzy sets are also applied to investment risk warning. The approach enables risk analysis in a simple and intuitive manner and integrates objective and subjective risk assessments within the decision-makers' risk expectation.


Introduction
The purpose of engineering investment is to obtain satisfactory returns; such decisions, however, are affected by considerable uncertainties. Expected revenue largely depends on the analysis and control of these uncertainties. They persist throughout the entire investment process, and investment decision-making plays a fundamental role because it is the starting point of that process. According to expert estimation and case studies of major projects, early decision-making accounts for 70% or more of the influence on an entire project. A large number of projects fail because of errors in initial investment decisions. In the investment decision-making process of large-scale projects, many risk factors can cause decision failure. The most crucial factors

Risk Correlation of Project Investment Decision-Making
As a sub-branch of artificial intelligence, CBR is a mode of reasoning that generates solutions to current problems by studying solutions to past problems of a similar nature stored in a knowledge database [8]. The method reuses past cases and experiences to solve and evaluate new problems, explain abnormal conditions, and understand new conditions. Figure 1 shows the analysis flow in project investment.
The more cases are stored, the more comprehensive the reference value. CBR research focuses mainly on case storage, case retrieval, and similarity algorithms. However, it treats each case as an independent item rather than thoroughly studying the correlations between cases and attributes. Studying such correlations across a large number of cases not only filters out the particularity of individual cases but also reflects the essential characteristics of a wide range of cases, so that risk-inherent correlations can be explored at a certain confidence level. The application of CBR in engineering projects is currently at its initial stage; thus, few studies on risk correlation mining have been conducted. However difficult, the key to exploring risk correlations and the factors relevant to them is to determine the process involved in investment decision-making in engineering projects. On the basis of these insights, we uncover and compile three kinds of risk correlations (see Table 1) by combining the correlation mining methods applied in other domains with the risk factors present in engineering projects. The three kinds are described as follows.
(1) An existing qualitative correlation is always present as an influencing factor and index set, and it is also found in the risk identification links. It is accepted knowledge, and this type of correlation is the one most easily identified. For example, investment decision-making is composed of a series of first-grade, second-grade, and even multigrade risk indexes (see Figure 2). In risk prediction, the risk factors affecting the target sets are first listed. Subsequently, these factors are divided into two or more grades according to the category they belong to. Subjective scoring methods, such as the Analytic Hierarchy Process and fuzzy comprehensive evaluation, are typically used to calculate the degree of influence of each risk factor, that is, the weight that it carries [9-11]. As far as qualitative correlation is concerned, weights and methods are not taken into account, although many risk factors can be listed from subjective experience rather than by scientific methods. Confidence levels can even reach 100%.
(2) Derivation correlation mainly comprises type derivation, influence degree derivation, causality derivation, optimum derivation, and formula derivation. This paper provides examples of these derivations based on CBR, risk prediction, and risk management at the investment decision-making stage. The types of derivation correlations are discussed as follows.

(a) Type Derivation Uses the Clustering Method to Classify Existing Project Cases
Type derivation uncovers the risk events, risk occurrence probabilities, and risk solutions of each type and summarizes them to serve as individual category markers. For a new project, cases can be searched and similarity can be calculated based on this derivation. The new project can also learn from the risk data and risk measures of its category for decision-making. When a new project is completed, its information is stored in the case database for future project risk management. This approach is a process of self-learning and incremental learning.
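The search-and-similarity step described above can be illustrated with a small sketch. It is only one possible reading: the paper does not specify a similarity measure, so the weighted distance below, the function names `similarity` and `retrieve`, and the case data are all assumptions made for illustration.

```python
from math import sqrt

def similarity(a, b, weights):
    """Weighted similarity in [0, 1] for attribute vectors scaled to [0, 1].

    Assumption: similarity = 1 - weighted Euclidean distance; the paper does
    not prescribe a particular measure.
    """
    total = sum(weights)
    dist = sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) / total)
    return 1.0 - dist

def retrieve(case_base, target, weights, k=1):
    """Return the k stored cases most similar to the target, best first."""
    scored = [(name, similarity(attrs, target, weights))
              for name, attrs in case_base.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]

# Hypothetical case base: attribute vectors already normalized to [0, 1].
case_base = {
    "bridge_2001": [0.9, 0.2, 0.4],
    "road_2002": [0.3, 0.8, 0.5],
    "tunnel_2003": [0.7, 0.3, 0.5],
}
weights = [0.5, 0.3, 0.2]
best = retrieve(case_base, [0.85, 0.25, 0.45], weights, k=2)
```

In a fuller system, the risk data attached to the retrieved cases would then be reused for the new project, and the finished project would be appended to `case_base`.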
Take the variable fuzzy clustering iterative model as an example. Suppose n samples to be clustered compose a set and each sample is described by m indexes. The $m \times n$ eigenvalue matrix given in (2.1) can be used for sample set clustering:

$$X = (x_{ij})_{m \times n}, \tag{2.1}$$

where $x_{ij}$ is the eigenvalue of index i of the clustering sample j, $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$.
The indexes differ in dimension and magnitude, and they include both positive and negative indicators. Therefore, the original data must be normalized, and the normalized values must lie in the range [0, 1]. Different normalization methods can be used according to the specific problem. After normalization, matrix X is transformed into the index eigenvalue normalization matrix

$$R = (r_{ij})_{m \times n}, \tag{2.2}$$

where $r_{ij}$ is the normalized index eigenvalue and $r_j = (r_{1j}, r_{2j}, \ldots, r_{mj})$ is the index eigenvalue vector of sample j.
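As a concrete illustration of the normalization step, the sketch below applies min-max scaling, flipping negative indicators so that every value falls in [0, 1]. Min-max scaling is only one of the "different normalization methods" the text allows, and the function name `normalize`, the `positive` flags, and the sample data are assumptions.

```python
def normalize(matrix, positive):
    """Min-max normalize an m x n eigenvalue matrix (rows = indexes, columns = samples).

    positive[i] is True when larger values of index i are better; negative
    indexes are flipped so that every normalized value lies in [0, 1].
    """
    result = []
    for row, is_positive in zip(matrix, positive):
        lo, hi = min(row), max(row)
        span = hi - lo
        if span == 0:
            # Assumption: an index that is constant across samples maps to 1.
            result.append([1.0 for _ in row])
        elif is_positive:
            result.append([(x - lo) / span for x in row])
        else:
            result.append([(hi - x) / span for x in row])
    return result

X = [[10.0, 20.0, 30.0],  # hypothetical positive index
     [5.0, 1.0, 3.0]]     # hypothetical negative index
R = normalize(X, positive=[True, False])
```

Each row of `R` corresponds to one index, so column j of `R` plays the role of the vector r_j.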
Suppose the sample set is divided into c classes, and $s_h = (s_{1h}, s_{2h}, \ldots, s_{mh})$ is the fuzzy clustering center vector of class h, where $h = 1, 2, \ldots, c$ and $0 \le s_{ih} \le 1$; p is the distance parameter, $\alpha$ is the optimal criteria parameter, and $\omega = (\omega_1, \omega_2, \ldots, \omega_m)$ is the index weight vector. Equation (2.3) gives the optimal fuzzy clustering matrix $u^*_{hj}$ and the optimal fuzzy clustering center matrix $s^*_{ih}$:

2.3
In this model, the sample weights, relative membership degrees, and cluster centers tend to stabilize over the dynamic iteration. The advantage of the model is that it considers not only the index weight but also the relative membership degree $u_{hj}$ of sample j to class h, which acts as a second weight and yields a more complete weighted distance. On that basis, the accuracy of type derivation can be improved to some extent.
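The alternating update of memberships and cluster centers can be sketched as follows. This is not the variable fuzzy clustering iterative model itself: as a stand-in, the sketch uses standard fuzzy c-means with fuzzifier 2 and an index-weighted Euclidean distance, and the function names, initial centers, and data are assumptions. Samples are stored as per-sample vectors r_j rather than as an m × n matrix.

```python
def weighted_sq_dist(r, s, w):
    """Squared index-weighted Euclidean distance between sample r and center s."""
    return sum((wi * (ri - si)) ** 2 for wi, ri, si in zip(w, r, s))

def memberships(samples, centers, w, eps=1e-12):
    """Membership of each sample in each class (fuzzy c-means, fuzzifier 2)."""
    u = []
    for r in samples:
        inv = [1.0 / (weighted_sq_dist(r, s, w) + eps) for s in centers]
        total = sum(inv)
        u.append([x / total for x in inv])
    return u

def fuzzy_cluster(samples, centers, w, iterations=50):
    """Alternate membership and center updates; return final memberships and centers."""
    for _ in range(iterations):
        u = memberships(samples, centers, w)
        # Each center is the mean of the samples weighted by squared membership.
        centers = [
            [sum(uj[h] ** 2 * r[i] for uj, r in zip(u, samples))
             / sum(uj[h] ** 2 for uj in u)
             for i in range(len(samples[0]))]
            for h in range(len(centers))
        ]
    return memberships(samples, centers, w), centers

# Hypothetical normalized samples (two indexes each) and initial centers.
samples = [[0.10, 0.20], [0.15, 0.10], [0.90, 0.80], [0.85, 0.95]]
u, centers = fuzzy_cluster(samples, [[0.0, 0.0], [1.0, 1.0]], w=[0.5, 0.5])
labels = [max(range(len(centers)), key=lambda h: uj[h]) for uj in u]
```

The membership of each sample sums to 1 across classes, and `labels` assigns each sample to the class with the largest membership.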

Mathematical Problems in Engineering (b) Influence Degree Derivation Mainly Aims at Weight and Risk Consequence
Suppose risk event A is more important, or has more serious consequences, than event B, and B is more important than C. Then A is certainly more important than C; that is, A > B and B > C imply A > C. Thus, in the risk prediction and risk control of a new project, A should receive more attention than B and C to avoid risk losses.

(c) Causality Derivation Is Similar to Influence Degree Derivation
Suppose that in some link, risk event C is directly caused by event B, and B is directly caused by A. Meanwhile, event C is more serious than B, and B is more serious than A. Hence, when event A occurs, the transformation conditions from A to B and from B to C should be controlled in a timely manner to prevent the more serious event C from occurring. A causal correlation can be revealed from a wide range of existing cases, and this correlation can contain the risk loss before it escalates.

(d) Optimum Derivation Consists of Project Time Optimization, Cost Optimization, Resource Optimization, Bi-Objective Optimization, and Multi-Objective Optimization
From the construction periods, costs, and resource allocations of completed projects, the optimal project duration and the optimal cost interval can be summarized for each category of existing cases. The construction period, cost, and resources of a new project can then be reasonably controlled on the basis of its category data, which can effectively reduce certain risks.

(e) Formula Derivation Mainly Uses the Western Economic Principles Associated with Mathematical Statistics Methods
Formula derivation deduces quantitative risk correlations in a reasoned way, and its results can serve as reference values for improving the risk assessment of investment decision-making. Because quantitative risk analysis is difficult, few investigations address this type of correlation, and those that do are limited to macroeconomic and financial risks. At present, this type of correlation covers only the relationship between a single risk factor and the investment target. Research should address not only the comprehensive effect of risk factors but also the quantitative relationships among them. Because this type involves quantitative analyses, along with some implicit assumptions in the derivation process, it has lower confidence but greater interestingness than the first type. Moreover, it rests on rigorous economic formulas and statistical inference and therefore provides scientific reference values. Given these difficulties, this paper focuses only on risk measurement derivation. Risk measurement is one of the indicators used to determine risk intensity. Risk is understood in different ways, which gives rise to different risk measurement techniques. Equation (2.4) is one representation of this type of derivation.
If the probability distribution of risk event X is unknown, the empirical distribution of X can be obtained through statistical analysis. Thus, the risk intensity of event X is

$$F(X) = \frac{\sqrt{V(X)}}{E(X)}, \tag{2.4}$$

where $E(X)$ is the mean value of sample X, which is expressed as

$$E(X) = \frac{1}{n}\sum_{i=1}^{n} X_i, \tag{2.5}$$

where n is the sample size, $X_i$ is the value of the ith sample point, and $V(X)$ is the sample variance.

This derivation correlation process combines basic economics and statistics formulas with investment risk factors. Fluctuations in the prices of project resources or in the supply volume exert some influence on total investment. According to Western economics, a relationship exists between the resource price and the supply volume [12]:

$$E_S = \frac{\Delta Q / Q}{\Delta P / P}, \tag{2.6}$$

where $E_S$ is the elasticity of supply price, P is the resource price, Q represents the supply volume, $\Delta P$ is the incremental price of the resource, and $\Delta Q$ denotes the incremental volume of supply. Suppose $X = \Delta P / P$ and $Y = \Delta Q / Q$ represent the rates of change in the resource price and the supply volume, respectively. Combined with (2.6), this gives

$$Y = E_S X. \tag{2.7}$$

Suppose X is a random variable and $E_S$ is constant while the market condition is stable. Then Y is also a random variable, with mean and variance

$$E(Y) = E_S\,E(X), \qquad V(Y) = E_S^2\,V(X). \tag{2.8}$$

Calculating the risk degree of Y using (2.4), we obtain

$$F(Y) = \frac{\sqrt{V(Y)}}{E(Y)} = \frac{E_S\sqrt{V(X)}}{E_S\,E(X)} = F(X). \tag{2.9}$$

From (2.9), the risk degree $F(Y)$ of Y is equal to the risk degree $F(X)$ of X.
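A short numerical check may make this result easier to see. The sketch below computes the risk degree as the square root of the sample variance divided by the sample mean and confirms that scaling X by a constant elasticity leaves it unchanged. The n − 1 divisor for the sample variance, the function name `risk_degree`, and all numbers are assumptions; the text only says "sample variance".

```python
from statistics import mean, stdev

def risk_degree(values):
    """Risk degree F = sqrt(V) / E, with V the sample variance (n - 1 divisor)."""
    return stdev(values) / mean(values)

# Hypothetical observations of X = dP / P, the rate of change in resource price.
price_change = [0.02, 0.05, 0.03, 0.06, 0.04]
elasticity = 1.5  # hypothetical constant E_S under stable market conditions

# Y = E_S * X, the rate of change in supply volume.
supply_change = [elasticity * x for x in price_change]

f_x = risk_degree(price_change)
f_y = risk_degree(supply_change)
```

Because both the standard deviation and the mean scale by the same positive constant, `f_x` and `f_y` agree up to floating-point error.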
Because of the market risk, the maximum resource price is as follows:

$$P_{\max} = E(P) + \sqrt{V(P)} = E(P)\,[1 + F(P)]. \tag{2.10}$$

(i) If $F(P) = 0$, that is, the market risk is not considered, the initial value of the resource price is $P_0 = E(P)$.
(ii) If $F(P) \neq 0$, that is, the market risk is taken into account, the mean of the incremental resource price is $\Delta P = \tfrac{1}{2}(P_{\max} - P_0) = \tfrac{1}{2} P_0 \cdot F(P)$.

(iii) Because of fixed costs, the increment of total investment caused by risks is equal to the increment in the value of the resources consumed. Thus, the average increase rate of total investment induced by risks is

$$\frac{1}{B_0}\sum_{i} \frac{1}{2}\,Q_i\,P_{0i}\,F_{pi}, \tag{2.11}$$

where $Q_i$ is the consumption of the ith resource, $P_{0i}$ is the initial value of the ith resource price, $F_{pi}$ represents the risk degree of the ith resource price, and $B_0$ denotes the initial estimation of total investment.

Correlation studies of this kind based on CBR are still relatively rare. In traditional project management, experiential knowledge is often lost at the end of the project. CBR, therefore, is not only a repository of existing cases but also a platform for case summaries and knowledge mining. Because assumptions and fault tolerance are allowed in deriving such correlations, the resulting correlation has lower confidence but is more interesting. Moreover, this correlation is based on CBR and on concrete data from completed projects; thus, it is of scientific reference value.
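Items (i) to (iii) can be traced with a small worked sketch. It assumes the increase rate is the sum over resources of consumption times the mean price increment from (ii), divided by the initial investment estimate; the function names and all figures are hypothetical.

```python
def max_price(expected_price, risk_degree):
    """P_max = E(P) * (1 + F(P)), as in (2.10)."""
    return expected_price * (1.0 + risk_degree)

def mean_price_increment(initial_price, risk_degree):
    """Mean increment 0.5 * (P_max - P_0) = 0.5 * P_0 * F(P), with P_0 = E(P)."""
    return 0.5 * initial_price * risk_degree

def investment_increase_rate(resources, initial_budget):
    """Sum of Q_i times the mean increment of resource i, divided by B_0."""
    extra = sum(q * mean_price_increment(p0, f) for q, p0, f in resources)
    return extra / initial_budget

# Hypothetical resources: (consumption Q_i, initial price P_0i, risk degree F_pi).
resources = [(1000.0, 50.0, 0.10), (200.0, 400.0, 0.05)]
rate = investment_increase_rate(resources, initial_budget=500000.0)
```

With these made-up inputs the two resources add 2500 and 2000 to the expected cost, so `rate` comes to 0.9% of the initial estimate.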
(3) People are often interested in potential correlations hidden in data. Therefore, correlation mining through objective and subjective methods is the most interesting of the three correlations. Current research on such correlations falls into two types. One focuses mainly on algorithm improvement and computer programming for algorithms; this approach has few applications in engineering. The other employs practical analysis but targets only individual cases rather than generality. However, theory is indispensable for practical use. We therefore uncover six correlations that can provide decision support for investment decision-making in engineering projects: investment deviation prediction, schedule deviation prediction, quantitative and qualitative risk change, dynamic correlation of risk factors, risk warning threshold, and incremental and self-adaptive correlation. According to the source of the data, this type of correlation mining is divided into two categories: one is derived from qualitative data, and the other from quantitative data.

(a) Qualitative Data
Qualitative data are obtained mainly from expert scoring, Boolean values, characteristic values, and so forth. For example, index weight is a form of this correlation. It depends mainly on expert scoring and indicates the relative importance of the risk factors. Many studies on weight determination and improvement have been carried out. This paper focuses on studying the relationships among risk factors, at a certain confidence level, based on variable fuzzy sets, rough sets, Bayesian methods, decision trees, support vector machines, and so forth.

A fuzzy set usually varies with time, space, and conditions, particularly in engineering project investment. Because of the uncertainties that characterize a given project and its environment variables in the investment stage, fuzzy theory in engineering project research needs upgraded mathematical theories, models, and methods. We use variable fuzzy sets for a better fit. To yield sound, adaptive, and heuristic investment decisions, as well as to improve forecast quality and reduce reaction time in actual situations, the decision-maker requires intelligent algorithms in addition to CBR, such as rough sets, to address fuzziness and uncertainties mathematically. Decision or classification rules can be derived through knowledge reduction on the premise that classification ability remains unchanged.

Eleven practical cases of the same category are analyzed; their unit prices are mainly constrained by eight influencing factors. The eight factors and their grade distributions are enumerated in Table 2. The attribute classifications of each case are shown in Table 3 [13]. The rough sets method does not require any prior knowledge other than the dataset to be processed; thus, we adopt this method to reduce the attributes that influence the construction unit price based on the practical data shown in Table 3.
In rough sets, the universe $U \neq \emptyset$ is a finite set of objects. Suppose R is an equivalence relationship on U, and U/R represents the set of all equivalence classes under R. For a family $\mathbf{R}$ of equivalence relationships on U, if $P \subseteq \mathbf{R}$ and $P \neq \emptyset$, then $\cap P$, the intersection of all equivalence relationships in P, is also an equivalence relationship. $\cap P$ is the indiscernibility relationship on P, denoted by ind(P). Therefore, U/ind(P) denotes the knowledge related to the family P of equivalence relationships; it is usually denoted by U/P.

2.12
Calculating whether ind(R) is equal to ind(R − {R_i}) for each attribute R_i yields the attribute cores R_1, R_4, and R_8. Non-reduction attributes are not unique, and these are {R The rough sets method decreases the condition attributes to five, which substantially reduces computational complexity and improves decision-making efficiency. In actual forecasts, the more practical cases there are, the better the scientific decision support for the project. However, as the number of cases increases, the dependence among risk factors may change. It is practical, therefore, to study incremental correlation, which allows for certain error rates in the investment decision-making stage.
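The core test described above, comparing the partition induced by all condition attributes with the partition obtained after dropping one attribute, can be sketched in a few lines. The decision table below is a toy example and not the data of Table 3; the function names `partition` and `core` are likewise assumptions.

```python
def partition(table, attrs):
    """Equivalence classes U/ind(attrs): objects grouped by their values on attrs."""
    classes = {}
    for obj, row in table.items():
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, set()).add(obj)
    return {frozenset(c) for c in classes.values()}

def core(table, attrs):
    """Attributes whose removal changes the indiscernibility partition."""
    full = partition(table, attrs)
    return [a for a in attrs
            if partition(table, [b for b in attrs if b != a]) != full]

# Hypothetical cases with graded condition attributes R1..R3.
table = {
    "case1": {"R1": 1, "R2": 1, "R3": 0},
    "case2": {"R1": 1, "R2": 1, "R3": 1},
    "case3": {"R1": 2, "R2": 1, "R3": 0},
    "case4": {"R1": 2, "R2": 2, "R3": 1},
}
core_attrs = core(table, ["R1", "R2", "R3"])
```

In this toy table, dropping R2 leaves all four cases distinguishable, so R2 is dispensable, while dropping R1 or R3 merges two cases, so both belong to the core.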

(b) Quantitative Data
The primary sources of quantitative data are the objective data of each project. There are two approaches to processing these data. In the first, quantitative data are transformed into qualitative form by the triangular fuzzy or trapezoidal fuzzy method, and the correlations are then derived from the qualitative data. In the second, objective data are analyzed directly to explore correlations [14-16]. Following the second approach, we analyze the comprehensive effects of risk factors on risk monitoring and risk warning via the quantitative and qualitative change theories of the variable fuzzy sets method.
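The first approach, converting a quantitative value into a qualitative grade through triangular fuzzy numbers, might look like the sketch below. The grade boundaries, the names `triangular` and `to_grade`, and the choice of the maximum-membership grade are all assumptions made for illustration.

```python
def triangular(x, left, peak, right):
    """Membership of x in the triangular fuzzy number (left, peak, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical grades for a cost deviation rate, in percent.
grades = {
    "low": (-5.0, 0.0, 5.0),
    "medium": (0.0, 5.0, 10.0),
    "high": (5.0, 10.0, 15.0),
}

def to_grade(x):
    """Qualitative grade with the largest membership for a quantitative value."""
    return max(grades, key=lambda g: triangular(x, *grades[g]))

grade = to_grade(6.5)
```

A deviation rate of 6.5% has membership 0.7 in "medium" and 0.3 in "high", so it is labeled "medium"; the resulting labels can then be mined like any other qualitative data.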
The fuzzy sets concept was proposed by Zadeh in 1965 and then developed into a new mathematical discipline, fuzzy sets theory. However, fuzzy sets theory is static and cannot describe the dynamic variability of fuzziness, fuzzy events, or fuzzy concepts. Using static fuzzy sets theory to study the dynamics of fuzziness is therefore theoretically insufficient, and contradictions arise between theoretical studies and research objectives. Chen [17] proposed the relative membership degree and the relative membership function in the 1990s and established engineering fuzzy sets theory [18]. In the early 21st century, Chen [18-21] created the variable fuzzy sets theory, which broke through the static concepts and theory of fuzzy sets.
Using variable fuzzy sets with the relative membership function to describe intermediate transition gives a dynamic description of fuzziness in precise mathematical language. Suppose U is a universe and u is an element of U, $u \in U$. A and $A^c$ are a pair of opposite fuzzy concepts on u. At any point on the continuum number axis of the relative membership function, $\mu_A(u)$ is the relative membership degree of u to A, and $\mu_{A^c}(u)$ is the relative membership degree of u to $A^c$, with $\mu_A(u) + \mu_{A^c}(u) = 1$, where $0 \le \mu_A(u) \le 1$ and $0 \le \mu_{A^c}(u) \le 1$. As seen in Figure 3, at the left pole $P_l$, $\mu_A(u) = 1$ and $\mu_{A^c}(u) = 0$; at the right pole $P_r$, $\mu_A(u) = 0$ and $\mu_{A^c}(u) = 1$. $P_m$ is the gradual qualitative change point of the continuum, which runs over [1, 0] for A and [0, 1] for $A^c$; at $P_m$, $\mu_A(u) = \mu_{A^c}(u) = 0.5$. Suppose $D_A(u)$ is the relative difference degree of u to A, defined as $D_A(u) = \mu_A(u) - \mu_{A^c}(u)$. It is seen from Figure 4 that $P_m$, where $D_A(u) = 0$, is the point at which dynamic balance with gradual qualitative change is reached. Points $P_l$ and $P_r$, where $D_A(u) = 1$ and $D_A(u) = -1$, are the points at which mutational qualitative change is reached. Thus, the two forms of qualitative change, gradual change and mutation, can be completely and clearly expressed by the relative difference degree.
Suppose C is a variable factor set of V, where $C_A$ is the variable time factor set, $C_B$ represents the variable spatial factor set, $C_C$ denotes the variable condition factor set, $C_D$ is the variable model set, $C_E$ stands for the variable parameter set, and $C_F$ is the set of other variable factors.
The standard models for evaluating quantitative or qualitative change in a variable fuzzy set are as follows:
(i) The criterion for quantitative change is $D_A(u) \cdot D_A(C(u)) > 0$.
(ii) The criterion for gradual qualitative change is $D_A(u) \cdot D_A(C(u)) < 0$.
(iii) Two criteria are assigned to mutational qualitative change: (a) one for a change that does not pass through the gradual qualitative change point, and (b) one for a change that passes through the gradual qualitative change point.

The data in Tables 4 and 5 were taken from a highway construction project [22]. We analyze the quantitative and qualitative changes in risk factors during the construction period to provide a reference for determining the risk threshold value. $X_1$ is the deviation rate of the total investment cost, and $X_2$ is the deviation rate of the schedule. This work comprehensively evaluates risks based on these two deviation rates. The relative difference degrees of investment cost and schedule from February 2003 to September 2003 are calculated from the eigenvalues of the intervals [a, b] and [b, d] (see Table 6).
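Criteria (i) and (ii) amount to a sign test on the product of the relative difference degrees before and after a change. A minimal sketch follows, with the function name `change_type` and the handling of a zero product as assumptions; the mutational criteria under (iii) are not covered.

```python
def change_type(d_before, d_after):
    """Classify a change from the product of two relative difference degrees.

    A positive product means both states lie on the same side of the gradual
    qualitative change point (quantitative change); a negative product means
    the point was crossed (gradual qualitative change).
    """
    product = d_before * d_after
    if product > 0:
        return "quantitative"
    if product < 0:
        return "gradual qualitative"
    # Assumption: a zero product means one state sits exactly on the point D = 0.
    return "at gradual qualitative change point"

# Hypothetical monthly relative difference degrees.
monthly = [0.40, 0.25, 0.10, -0.05, -0.20]
changes = [change_type(a, b) for a, b in zip(monthly, monthly[1:])]
```

Only the step from 0.10 to -0.05 crosses zero, so it is the single gradual qualitative change in this made-up series.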
Suppose the weight vector of the two indexes is $\omega = (0.5, 0.5)$, so that the risk relative difference degree of each month is $D_A(u) = \sum_{i=1}^{2} \omega_i D_A(u_i)$. The relative difference degree of the comprehensive risk of each month is shown in Table 7. A value of $D_A(u)$ that moves closer to −1 indicates high risk; by contrast, a value that moves closer to 1 indicates low risk. Table 7 shows that the changes occurring from April 2003 to May 2003 are gradual qualitative changes, whereas the changes over the other consecutive intervals are quantitative. The result in Table 7 is simple and intuitive. In addition, decision-makers can combine the results with their own risk tolerance to determine the risk threshold at which appropriate measures are implemented. The proposed method thus combines objective and subjective evaluations.
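The monthly calculation can be sketched as below. The sketch assumes a linear relative difference function that equals 1 at a, 0 at b, and −1 at d, so that values near −1 signal high risk as described above; the interval bounds and deviation rates are hypothetical and are not taken from Tables 4 to 7.

```python
def relative_difference(x, a, b, d):
    """Linear relative difference degree: 1 at a, 0 at b, -1 at d (assumed form)."""
    if min(a, b) <= x <= max(a, b):
        return (x - b) / (a - b)
    if min(b, d) <= x <= max(b, d):
        return -(x - b) / (d - b)
    raise ValueError("x lies outside the interval spanned by a and d")

def composite(values, bounds, weights):
    """Weighted sum of the per-index relative difference degrees."""
    return sum(w * relative_difference(x, *abd)
               for x, abd, w in zip(values, bounds, weights))

# Hypothetical eigenvalues (a, b, d) for the cost and schedule deviation rates.
bounds = [(0.0, 5.0, 10.0), (0.0, 4.0, 8.0)]
weights = [0.5, 0.5]

# Hypothetical deviation rates for one month: cost 2.5, schedule 5.0.
d_month = composite([2.5, 5.0], bounds, weights)
```

Here the cost index contributes 0.5 and the schedule index −0.25, giving a composite value of 0.125; a decision-maker would compare such monthly values against a chosen threshold.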

Conclusions
Research on risk correlation remains a bottleneck in current risk management for engineering project investment decision-making. We divide risk correlation into three types and elucidate the third type using actual data. The proposed approach combines data mining and variable fuzzy sets with investment decision-making, yielding simple, intuitive, and easily explainable results. The findings of this study provide reference value because they comprehensively consider risk prediction, risk management, and cost reduction analysis. Dynamic correlation and incremental correlation are the directions for further study.
Figure 1: Analysis flow based on CBR technique. (Flowchart nodes: input the attributes and data of the target case; case display; search for a matching case in the case bank; transfer case data; analyze each condition in the project; store the completed project in the case bank; search for potential rules based on case data; does the case conform to the requirement?)

Figure 2: Risk indexes of investment decision-making.

Figure 4: Schematic diagram of relative difference function.
The linear equations of the relative difference degree are $D_A(u) = (x_i - b_i)/(a_i - b_i)$ for $x_i \in [a_i, b_i]$ and $D_A(u) = -(x_i - b_i)/(d_i - b_i)$ for $x_i \in [b_i, d_i]$ [23].

Table 6: The eigenvalues of [a, b] and [b, d] of the deviation index.

Table 1: Risk correlation classes of engineering project investment decision-making.

Table 2: Attributes used in CBR prediction model.

Table 3: Classes specified for output attribute of cost per m².
R_8} is the condition attribute set, and D = {Output} is the target set. Their respective equivalence relationships based on attribute values are as follows:

Table 4: Basic data from Feb. 2003 to Sep. 2003 of section F in the contract.

Table 5: Calculation of investment cost risk of section F in the contract.

Table 7: Relative difference degrees from Feb. 2003 to Sep. 2003 of section F in the contract.