Application of Set Pair Analysis Method Based on Entropy Weight in Groundwater Quality Assessment: A Case Study in Dongsheng City, Northwest China

Groundwater quality assessment plays an important role in the rational development and utilization of groundwater, and groundwater quality strongly influences the health of local people. However, most traditional comprehensive water quality assessment methods rely on complicated formulas and are difficult to apply in practice. In this paper, set pair analysis (SPA), a novel method for groundwater quality assessment, is introduced, and an entropy weight is assigned to each index to improve the assessment model. The calculation steps are described, and groundwater quality assessment in Dongsheng City is taken as a case study. The results indicate that groundwater quality in the study area is relatively good. The SPA method is easy to use, its calculations are simple and use almost all of the relevant information, and its results are reasonable, reliable and intuitive, which makes it well suited to groundwater quality assessment and worth promoting.


Introduction
Groundwater is vital to local people in arid and semi-arid areas, and in Northwest China nearly 70% of the water supply for human consumption is sourced from groundwater. Whether groundwater is suitable for drinking is therefore an essential question. To date, many groundwater quality assessment methods have been proposed, such as the water quality index (WQI) method [1], the fuzzy mathematics method [2], the osculating value method [3], the grey cluster method [4] and the artificial neural network method [5]. However, water quality assessment combines the certainty of the evaluation criteria with the uncertainty of the analysis results. Because of the complexity and randomness of the evaluation factors and the nonlinear relationship between water quality criteria and evaluation indices, it is difficult to establish a unified evaluation model, which motivates further study of water quality assessment methods.
Set Pair Analysis (SPA) was proposed by Zhao [6], a Chinese scholar, in 1989. It is a modified uncertainty theory that treats certainties and uncertainties as an integrated certain-uncertain system and describes them systematically from three aspects: identity, discrepancy and contrary [6-8]. In SPA theory, two related sets in an uncertain system are paired, and the connection degree of the set pair is established from their identity, discrepancy and contrary. A series of SPA-based studies have been built on the connection degree formula. Su et al. [9] employed SPA to assess urban ecosystem health and concluded that SPA can serve as an effective relative measure for comparing the health levels of different urban ecosystems. Zhou [10] used an SPA-based fuzzy assessment method for real-time risk assessment. Gao and Chen [11] introduced SPA into risk ranking and found that the approach was convenient to operate and gave a more comprehensive ranking. Wang et al. [12] introduced SPA into water resources system assessment and studied two cases. They concluded that SPA was simple in concept, convenient to calculate and feasible to apply.
As the studies above show, SPA has been introduced into many fields and has performed well both in concept and in application. However, SPA is a relatively new theory and needs further in-depth study. Since it was introduced into water systems, it has drawn much attention. In previous studies, the average connection degree of the set pair was taken as the arithmetic mean over all indicators, which is not appropriate because different indicators differ in importance. In this paper, the original SPA method is improved by using information entropy theory to determine the weights of the evaluation indices. This provides an alternative approach to the evaluation of groundwater quality.

The principle of SPA
The basis of SPA is the set pair, and its key concept is the connection degree. Suppose there are two sets, A and B, which together form a set pair H(A, B). To investigate the relationship within H(A, B) and judge how good or bad it is, the connection degree is defined as

$$\mu = \frac{S}{N} + \frac{F}{N}\,i + \frac{P}{N}\,j = a + b\,i + c\,j \tag{1}$$

where N is the total number of characteristic terms of the set pair, S is the number of identical terms, F is the number of discrepant terms and P is the number of contrary terms, so that N = S + F + P. Here a = S/N, b = F/N and c = P/N represent the identity degree, discrepancy degree and contrary degree of the two sets, respectively, and a + b + c = 1. The term i is the uncertainty coefficient of discrepancy, which takes values in [-1, 1] depending on the conditions; j is the uncertainty coefficient of the contrary, which is fixed at -1.
Equation (1) reflects the overall structure of the relationship between sets A and B, while a, b and c reflect its finer internal structure. The connection degree overcomes a drawback of traditional relationship measures such as the correlation coefficient, the subordinate degree or the grey correlation degree, each of which provides only a single index [12]. SPA shows the structure of the relationship clearly, quantifies three or more characteristics of a complex relationship, and gives a value for the comprehensive relationship that can change with the required standards or with the chosen value of i. Because it is a simple mathematical description with a clear physical meaning, SPA has been introduced into various fields.
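As a minimal sketch of Equation (1), the connection degree can be computed from the three counts as follows. The function name, argument names and the example counts are hypothetical and chosen only for illustration; the default j = -1 follows the text, while i = 0.5 is an arbitrary choice within [-1, 1].

```python
def connection_degree(s, f, p, i=0.0, j=-1.0):
    """Return (a, b, c, mu) for a set pair, with mu = a + b*i + c*j.

    s, f, p: counts of identical, discrepant and contrary terms.
    i: discrepancy coefficient in [-1, 1]; j: contrary coefficient (-1).
    """
    n = s + f + p  # total number of characteristic terms
    if n <= 0 or min(s, f, p) < 0:
        raise ValueError("counts must be non-negative and not all zero")
    if not -1.0 <= i <= 1.0:
        raise ValueError("i must lie in [-1, 1]")
    a, b, c = s / n, f / n, p / n
    return a, b, c, a + b * i + c * j


# Hypothetical counts: 6 identical, 3 discrepant and 1 contrary term.
a, b, c, mu = connection_degree(6, 3, 1, i=0.5)
print(f"a={a:.2f}, b={b:.2f}, c={c:.2f}, mu={mu:.2f}")
```

With these counts, a = 0.6, b = 0.3 and c = 0.1, so that the connection degree is 0.6 + 0.3 × 0.5 - 0.1 = 0.65.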

SPA in groundwater quality assessment
The essence of assessment is a comparison of how similar two objects are in their characteristics. If they are similar, they are placed in the same class; if not, they are placed in different classes. Comprehensive water quality assessment is based on the similarity between a monitoring sample and the water quality standards. To assess water quality with the SPA method, a set pair is first formed from the concentration of every index and the water quality standards. Next, the connection degree between every index of every monitoring sample and the standards is calculated. The average connection degree of every monitoring sample is then obtained, and the water quality rank is decided from it. According to the Groundwater Quality Standards of China, groundwater quality is classified into five ranks: rank I (excellent quality water), rank II (good quality water), rank III (medium or average quality water), rank IV (poor quality water) and rank V (extremely poor quality water). The groundwater quality ranking standards are shown in Table 1.

Table 1. Groundwater quality ranking standards
Note: Units are mg·L⁻¹.
The connection degree plays a key role in comprehensive water quality assessment based on SPA and can be calculated with the following formulas.
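(1) Connection degree between index j and the standard of Rank I

$$\mu_{j1} = \begin{cases} 1, & x \in [0,\, S_j^{(1)}] \\[4pt] 1 - \dfrac{2\,(x - S_j^{(1)})}{S_j^{(2)} - S_j^{(1)}}, & x \in (S_j^{(1)},\, S_j^{(2)}] \\[4pt] -1, & x \in (S_j^{(2)},\, +\infty) \end{cases} \tag{2}$$

(2) Connection degree between index j and the standard of Rank II

$$\mu_{j2} = \begin{cases} 1 - \dfrac{2\,(S_j^{(1)} - x)}{S_j^{(1)}}, & x \in [0,\, S_j^{(1)}] \\[4pt] 1, & x \in (S_j^{(1)},\, S_j^{(2)}] \\[4pt] 1 - \dfrac{2\,(x - S_j^{(2)})}{S_j^{(3)} - S_j^{(2)}}, & x \in (S_j^{(2)},\, S_j^{(3)}] \\[4pt] -1, & x \in (S_j^{(3)},\, +\infty) \end{cases} \tag{3}$$

(3) Connection degree between index j and the standard of Rank III

$$\mu_{j3} = \begin{cases} -1, & x \in [0,\, S_j^{(1)}] \\[4pt] 1 - \dfrac{2\,(S_j^{(2)} - x)}{S_j^{(2)} - S_j^{(1)}}, & x \in (S_j^{(1)},\, S_j^{(2)}] \\[4pt] 1, & x \in (S_j^{(2)},\, S_j^{(3)}] \\[4pt] 1 - \dfrac{2\,(x - S_j^{(3)})}{S_j^{(4)} - S_j^{(3)}}, & x \in (S_j^{(3)},\, S_j^{(4)}] \\[4pt] -1, & x \in (S_j^{(4)},\, +\infty) \end{cases} \tag{4}$$

(4) Connection degree between index j and the standard of Rank IV

$$\mu_{j4} = \begin{cases} -1, & x \in [0,\, S_j^{(2)}] \\[4pt] 1 - \dfrac{2\,(S_j^{(3)} - x)}{S_j^{(3)} - S_j^{(2)}}, & x \in (S_j^{(2)},\, S_j^{(3)}] \\[4pt] 1, & x \in (S_j^{(3)},\, S_j^{(4)}] \\[4pt] 1 - \dfrac{2\,(x - S_j^{(4)})}{S_j^{(5)} - S_j^{(4)}}, & x \in (S_j^{(4)},\, S_j^{(5)}] \\[4pt] -1, & x \in (S_j^{(5)},\, +\infty) \end{cases} \tag{5}$$

(5) Connection degree between index j and the standard of Rank V

$$\mu_{j5} = \begin{cases} -1, & x \in [0,\, S_j^{(3)}] \\[4pt] 1 - \dfrac{2\,(S_j^{(4)} - x)}{S_j^{(4)} - S_j^{(3)}}, & x \in (S_j^{(3)},\, S_j^{(4)}] \\[4pt] 1, & x \in (S_j^{(4)},\, +\infty) \end{cases} \tag{6}$$

where $S_j^{(1)}$, $S_j^{(2)}$, $S_j^{(3)}$, $S_j^{(4)}$ and $S_j^{(5)}$ are the ranking standards of the five ranks for index j, x is the measured value of index j in the sample, and $\mu_{j1}$, $\mu_{j2}$, $\mu_{j3}$, $\mu_{j4}$ and $\mu_{j5}$ are the connection degrees between index j and the five ranking standards.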
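The rank-wise connection degrees can be sketched in Python as follows. The sketch assumes a piecewise-linear form: the connection degree is 1 when the measured value lies inside the interval of a rank, changes linearly to -1 across the neighbouring rank's interval, and is -1 anywhere further away, with rank V taken to cover everything above the fourth threshold. The function names are hypothetical, and the threshold values in the usage note are illustrative only and are not taken from Table 1.

```python
import math


def ramp_up(x, lo, hi):
    """Rise linearly from -1 at x = lo to 1 at x = hi."""
    return 1.0 - 2.0 * (hi - x) / (hi - lo)


def ramp_down(x, lo, hi):
    """Fall linearly from 1 at x = lo to -1 at x = hi."""
    return 1.0 - 2.0 * (x - lo) / (hi - lo)


def rank_connection_degrees(x, standards):
    """Return [mu_j1, ..., mu_j5] for one measured index value x.

    standards: ascending thresholds S(1)..S(5) of that index.
    """
    if len(standards) != 5 or standards[0] <= 0:
        raise ValueError("need five positive thresholds")
    if any(lo >= hi for lo, hi in zip(standards, standards[1:])):
        raise ValueError("thresholds must be strictly ascending")
    if x < 0:
        raise ValueError("x must be non-negative")

    # bounds[k-1] and bounds[k] delimit the interval of rank k (k = 1..4);
    # rank V is assumed to cover every value above S(4).
    bounds = [0.0] + list(standards)
    degrees = []
    for k in range(1, 6):
        lo = bounds[k - 1]
        hi = bounds[k] if k < 5 else math.inf
        if lo < x <= hi or (k == 1 and x == 0):
            mu = 1.0  # x lies inside the interval of rank k
        elif k > 1 and bounds[k - 2] <= x <= lo:
            mu = ramp_up(x, bounds[k - 2], lo)  # x is one interval below rank k
        elif k < 5 and hi < x <= bounds[k + 1]:
            mu = ramp_down(x, hi, bounds[k + 1])  # x is one interval above rank k
        else:
            mu = -1.0  # x is two or more intervals away
        degrees.append(mu)
    return degrees
```

For example, with the illustrative thresholds `[50, 150, 250, 350, 450]`, a value of 100 lies in the middle of the rank II interval, so `rank_connection_degrees(100, [50, 150, 250, 350, 450])` gives 1 for rank II, 0 for ranks I and III, and -1 for ranks IV and V.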
Once the connection degrees between every index and every ranking standard have been calculated, the average connection degree of each sample can be calculated with the following formula:

$$\mu_k = \sum_{j=1}^{n} \omega_j\, \mu_{jk} \tag{7}$$

where $\mu_k$ is the average connection degree between the sample and rank k, $\omega_j$ is the weight of index j, $\mu_{jk}$ is the connection degree between index j and the standard of rank k, and n is the number of indices. The water quality rank is then decided from the average connection degrees with the following formula:

$$\delta_i = \{\, k \mid \mu_k = \max\{\mu_1, \mu_2, \mu_3, \mu_4, \mu_5\} \,\} \tag{8}$$

where $\delta_i$ is the rank to which sample i belongs.
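The weighted averaging and the rank decision described above can be sketched as follows. The function names, the connection degrees and the weights are all hypothetical, and the tie rule (the better rank wins) is an assumption, since the text does not say how ties are resolved.

```python
RANKS = ("I", "II", "III", "IV", "V")


def average_connection_degrees(mu, weights):
    """Weighted mean over indices: mu_k = sum_j w_j * mu_jk.

    mu[j][k] is the connection degree of index j with rank k.
    """
    if len(mu) != len(weights) or any(len(row) != len(RANKS) for row in mu):
        raise ValueError("mu must have one row of five values per weight")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [sum(w * row[k] for w, row in zip(weights, mu))
            for k in range(len(RANKS))]


def decide_rank(avg):
    """Return the rank with the largest average connection degree.

    Ties go to the better (lower-numbered) rank; this rule is an assumption.
    """
    best = max(range(len(avg)), key=lambda k: avg[k])
    return RANKS[best]


# Hypothetical connection degrees for two indices and hypothetical weights.
mu = [[1.0, 0.0, -1.0, -1.0, -1.0],
      [0.0, 1.0, 0.0, -1.0, -1.0]]
avg = average_connection_degrees(mu, [0.4, 0.6])
print(avg, decide_rank(avg))
```

In this example the averages are 0.4, 0.6, -0.4, -1 and -1, so the sample is assigned to rank II.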

Entropy weight
The indices have different effects in the assessment system: some are strong and others are weak. A separate weight should therefore be given to each index. The weights can be determined by various methods, such as the analytic hierarchy process (AHP), the Delphi method and the information entropy method.
The concept of information entropy was first proposed by Shannon [13] in 1948, and it is regarded as a measure of the uncertainty of a stochastic event or of information content. The steps for calculating the entropy weight are as follows. Suppose m water samples are taken to evaluate the water quality (i = 1, 2, …, m) and each sample has n evaluated parameters (j = 1, 2, …, n). From the measured data, the eigenvalue matrix X can be constructed:

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix} \tag{9}$$

To eliminate the influence of the different units and orders of magnitude of the indices, the data must be preprocessed [13].
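According to their attributes, the indices can be divided into four types: efficiency type, cost type, fixed type and interval type [13]. For the efficiency type, the normalization function is:

$$y_{ij} = \frac{x_{ij} - (x_{ij})_{\min}}{(x_{ij})_{\max} - (x_{ij})_{\min}} \tag{10}$$

while for the cost type, the normalization function is:

$$y_{ij} = \frac{(x_{ij})_{\max} - x_{ij}}{(x_{ij})_{\max} - (x_{ij})_{\min}} \tag{11}$$

where $(x_{ij})_{\min}$ and $(x_{ij})_{\max}$ are the minimum and maximum values of index j over all samples. After the transformation, the standard-grade matrix Y is obtained:

$$Y = \begin{bmatrix} y_{11} & y_{12} & \cdots & y_{1n} \\ y_{21} & y_{22} & \cdots & y_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ y_{m1} & y_{m2} & \cdots & y_{mn} \end{bmatrix} \tag{12}$$

The ratio of the value of index j in sample i is then:

$$P_{ij} = \frac{y_{ij}}{\sum_{i=1}^{m} y_{ij}} \tag{13}$$

The information entropy is expressed by the formula below:

$$e_j = -\frac{1}{\ln m} \sum_{i=1}^{m} P_{ij} \ln P_{ij} \tag{14}$$

The smaller the value of $e_j$, the greater the effect of index j. The entropy weight can then be calculated with the formula below:

$$\omega_j = \frac{1 - e_j}{\sum_{j=1}^{n} (1 - e_j)} \tag{15}$$

where $\omega_j$ is the entropy weight of parameter j.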
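The entropy weight steps above can be sketched in Python as follows. This is a sketch under stated assumptions rather than the authors' implementation: it assumes min-max normalization per index, the ratio of each normalized value to its column sum, entropy scaled by ln m, and weights proportional to 1 - e_j. It also assumes the usual convention that a zero ratio contributes nothing to the entropy sum. The function name, the `cost` flag and the sample matrix are hypothetical.

```python
import math


def entropy_weights(x, cost=True):
    """Return (entropies, weights) for an m-by-n matrix x (samples by indices).

    cost=True treats every index as cost type (smaller is better);
    cost=False treats every index as efficiency type.
    """
    m, n = len(x), len(x[0])
    entropies = []
    for j in range(n):
        col = [row[j] for row in x]
        lo, hi = min(col), max(col)
        if hi == lo:
            raise ValueError(f"index {j} is constant and cannot be normalized")
        # Min-max normalization of column j.
        if cost:
            y = [(hi - v) / (hi - lo) for v in col]
        else:
            y = [(v - lo) / (hi - lo) for v in col]
        total = sum(y)
        # Entropy of the ratios; terms with a zero ratio are skipped (assumption).
        e = 0.0
        for v in y:
            p = v / total
            if p > 0:
                e -= p * math.log(p)
        entropies.append(e / math.log(m))
    spare = [1.0 - e for e in entropies]
    weights = [d / sum(spare) for d in spare]
    return entropies, weights


# Hypothetical concentrations (mg/L) for four samples and three indices.
data = [[20.0, 35.0, 210.0],
        [45.0, 60.0, 260.0],
        [30.0, 48.0, 400.0],
        [80.0, 52.0, 310.0]]
entropies, weights = entropy_weights(data)
print([round(e, 3) for e in entropies], [round(w, 3) for w in weights])
```

The returned weights sum to 1 and can be passed directly to the weighted average of the connection degrees.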

Case study
Dongsheng City, a typical semi-arid area, is situated in the south of the Erdos Basin, Inner Mongolia, China. In this area, groundwater is the main drinking water resource, and groundwater quality greatly influences the health of local people and the local economy [14]. A total of 15 groundwater samples were collected in August and September 2007. Samples were collected in pre-cleaned polyethylene bottles for physicochemical analysis. Prior to sampling, all sampling containers were washed and rinsed thoroughly with the groundwater to be sampled. Each groundwater sample was analyzed in the laboratory of Xi'an Institute of Geology and Minerals Resources using standard procedures recommended by the Chinese Ministry of Water Resources. In this paper, only four indices with a significant influence on water quality, namely chloride, sulphate, total hardness (TH) and total dissolved solids (TDS), were selected for the comprehensive groundwater quality assessment. The sample analysis results are shown in Table 2. Sample D1014 is taken as a computing example. The connection degree between every index and every ranking standard can be calculated from Equations (2) to (6). The results are shown in Table 3.
According to formulas (9) to (15), the information entropy and entropy weight of every index can be calculated; the results are listed in Table 4. According to Equation (7), the average connection degrees of sample D1014 can then be obtained. The calculated values for ranks I to V are 0.45, 0.99, -0.45, -0.99 and -1, respectively. The connection degree between sample D1014 and rank II, 0.99, is the largest, so sample D1014 belongs to rank II. The ranks of the other samples can be calculated in the same way, and the results are listed in Table 5. Table 5 shows that samples W117, W120, W114, W115 and W201 are excellent quality water (rank I), while the other 10 samples are good quality water (rank II), which indicates that groundwater in the study area can be used for consumption without any pretreatment.
The assessment results based on the SPA method were compared with assessment results based on fuzzy mathematics [2]. The comparison showed that the results of the two methods are consistent. The assessment results also agree with the field investigation: there are few residents and no major industry in or around the study area, which is consistent with the absence of significant pollution. Judging from the calculation process, SPA is a simple method that is easy to use.
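The rank decision for sample D1014 can be checked directly from the five average connection degrees reported above. The snippet below uses only those reported values; the variable names are hypothetical.

```python
# Average connection degrees of sample D1014 with ranks I to V, as reported.
reported = {"I": 0.45, "II": 0.99, "III": -0.45, "IV": -0.99, "V": -1.0}

# The sample is assigned to the rank with the largest average connection degree.
best_rank = max(reported, key=reported.get)
print(best_rank)
```

The largest value, 0.99, belongs to rank II, which matches the rank stated for sample D1014.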

Conclusion
In this paper, a novel method for comprehensive groundwater quality assessment, Set Pair Analysis (SPA), was introduced, and a weight based on information entropy was assigned to each index. The steps were described in detail, and water quality assessment in Dongsheng City was taken as a computing example. The calculated results show that groundwater in Dongsheng City is of excellent quality (rank I) or good quality (rank II) and is suitable for consumption. The SPA method has simple formulas and is easy to use, which makes it well suited to groundwater quality assessment and worth promoting. The calculation uses almost all of the relevant information and assigns an entropy weight to each index, which makes the assessment results reasonable, reliable and intuitive.

Table 2. Water sample analysis results

Table 3. Connection degree between every index and ranking standard in sample D1014

Table 4. Information entropy and entropy weight of parameters

Table 5. Results of water quality comprehensive assessment