Correlation coefficients are usually computed from crisp data. In this paper, we use Pearson’s correlation coefficient and propose a method for evaluating correlation coefficients for fuzzy interval data. Our empirical studies examine the relationship between mathematics achievement and other factors.
1. Introduction
Human thought processes are mainly based on cognitive awareness of the environment and social phenomena. Human knowledge is fuzzy because of humans’ subjective awareness of time and space. Therefore, Wu [1] proposed fuzzy theory in reference to how humans perceive complex and uncertain environmental phenomena.
To determine the correlation between phenomena X and Y, a scatter plot is often used. Using a scatter plot, the correlation between phenomena X and Y can be determined to be positive, negative, or statistically independent.
In traditional statistical analysis, correlation coefficients are often found using crisp data. In this paper, we use Pearson’s correlation coefficient to calculate correlation coefficients for fuzzy interval data. Fuzzy correlation coefficients are often applied in the fields of engineering or economics but have also been increasingly emphasized in social sciences.
Fuzzy correlations are referenced in the literature. For instance, Nguyen et al. [2, 3] provided the fundamentals of statistics with fuzzy data. Hong and Hwang [4] established the correlation coefficient of intuitionistic fuzzy sets in probability space by using the generalization of fuzzy sets by Zadeh [5]. Chiang and Lin [6] argued that membership degrees are concrete observational values based on the membership functions of fuzzy sets to define fuzzy correlation coefficients. Chaudhuri and Bhattacharya [7] investigated the correlation of two fuzzy sets that were defined by the members of the supports, which were ranked to evaluate the correlation coefficients of two fuzzy sets. Hong [8] and Ni and Cheung [9] also suggested some methods for calculating fuzzy correlations. Based on correlation coefficients developed by Liu and Kao [10], Xie and Wu [11] and Yang [12] established fuzzy correlation coefficients and obtained fuzzy correlation intervals based on fuzzy interval sample data. R. Saneifard and R. Saneifard [13] calculated the correlation coefficient for fuzzy data by adopting the method from central interval. Cheng and Yang [14] proposed a method for determining fuzzy correlation coefficients and explained the application of fuzzy correlation. Hanafy et al. [15] evaluated the correlation coefficients of neutrosophic sets by centroid method. Wu et al. [16] developed a new approach for determining fuzzy correlation and applied this approach to 12-year compulsory education in Taiwan. Lin et al. [17] investigated some problems on marketing research by using a soft computing technique and a new statistics tool.
The main purpose of this paper is to develop fuzzy correlation coefficients for fuzzy interval data. We propose a functional formula for determining the fuzzy correlation coefficient of two variables and find its maximum and minimum values by differentiation. Moreover, the formula can be applied not only when both data sets are fuzzy interval numbers but also when one of the two data sets consists of real numbers and when both data sets consist of real numbers. Using this method, we can provide information that helps researchers explain related phenomena in practice.
2. Research Approaches
Let (xi,yi), i=1,2,…,n, be a fuzzy sample set; then, the correlation coefficient between x and y is defined as
$$r_{xy}=\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^{2}}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^{2}}},\tag{1}$$
where x̄ and ȳ are the sample means of xi and yi, respectively.
Definition 1 (fuzzy interval number).
Let a fuzzy number A=[a,b] be an interval over the real numbers R, let c=(a+b)/2 be the center of interval A, and let r=(b-a)/2 be the radius of interval A; then, interval A can be expressed as A=[a,b] or A=(c;r). We call interval A a fuzzy interval number.
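As a small illustration of Definition 1, the two equivalent representations [a,b] and (c;r) can be sketched in Python. The class name `Interval` and the helper `from_center_radius` are hypothetical names introduced only for this sketch; they are not part of the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [a, b] with a <= b (hypothetical helper type)."""
    a: float
    b: float

    def __post_init__(self):
        if self.a > self.b:
            raise ValueError("lower endpoint must not exceed upper endpoint")

    @property
    def center(self) -> float:
        # c = (a + b) / 2
        return (self.a + self.b) / 2

    @property
    def radius(self) -> float:
        # r = (b - a) / 2
        return (self.b - self.a) / 2

    @classmethod
    def from_center_radius(cls, c: float, r: float) -> "Interval":
        # Inverse map: (c; r) -> [c - r, c + r]
        return cls(c - r, c + r)


x = Interval(4, 8)
print(x.center, x.radius)                 # 6.0 2.0
print(Interval.from_center_radius(6, 2))  # Interval(a=4, b=8)
```

A real number is the special case r=0, which is how crisp data fit into the same representation.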
Consider the fuzzy sample set (xi,yi),i=1,2,…,n, where xi=[xi1,xi2] and yi=[yi1,yi2] are fuzzy interval numbers, as shown in Figure 1.
The algorithm of the correlation coefficient between x and y consists of the following five steps.
Figure 1: The scatter plot with fuzzy interval data.
Step 1.
For any fuzzy interval number, xi=[xi1,xi2] and yi=[yi1,yi2], xi⊗yi is defined by a rectangle. In addition, the rectangle xi⊗yi has four vertices, Ai, Bi, Ci, and Di, the coordinates of which are (xi1,yi1), (xi2,yi1), (xi2,yi2), and (xi1,yi2), respectively, as shown in Figure 2.
Figure 2: The graph of a rectangle xi⊗yi.
Step 2.
Choose a point Ei on the line segment AiBi such that AiEi:EiBi=s:(1-s), where 0≤s≤1. The coordinates of Ei are (xi1+s(xi2-xi1),yi1), as shown in Figure 3.
Figure 3: The graph of a rectangle xi⊗yi.
Step 3.
Choose a point Gi on the line segment CiDi such that EiGi is parallel to AiDi. Next, choose a point Fi on the line segment EiGi such that EiFi:FiGi=t:(1-t), where 0≤t≤1. The coordinates of Fi are (xi1+s(xi2-xi1),yi1+t(yi2-yi1)), as shown in Figure 4.
Figure 4: The graph of a rectangle xi⊗yi.
Definition 2.
The domain set Ω={(s,t)∣0≤s≤1,0≤t≤1}.
Step 4.
Calculate the correlation coefficient function rxy between x and y by using formula (1). In this case, the correlation coefficient function rxy is a function of two variables s and t for the closed region bounded by Ω and is expressed as rxy=f(s,t).
Step 5.
By the differentiation method, we can find the maximum and minimum values of the correlation coefficient function rxy.
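The five steps can be sketched numerically as follows. This is a minimal sketch, not the authors' procedure: Step 5 is replaced here by a plain grid search over Ω instead of differentiation, and the names `pearson_r`, `f`, and `fuzzy_corr_grid` as well as the grid resolution are assumptions made for illustration. The data are the three rectangles of Example 3 in Section 2.1.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def f(rects, s, t):
    """Steps 1-4: pick the point F_i at relative position (s, t) in every
    rectangle [x1, x2] x [y1, y2] and correlate the resulting points."""
    xs = [x1 + s * (x2 - x1) for (x1, x2), _ in rects]
    ys = [y1 + t * (y2 - y1) for _, (y1, y2) in rects]
    return pearson_r(xs, ys)


def fuzzy_corr_grid(rects, steps=200):
    """Step 5, approximated: scan a (steps+1)^2 grid over Omega instead of
    differentiating. Returns (min, max) of f on the grid."""
    vals = [f(rects, i / steps, j / steps)
            for i in range(steps + 1) for j in range(steps + 1)]
    return min(vals), max(vals)


# Rectangle data of Example 3 (Section 2.1), as ((x1, x2), (y1, y2)).
example3 = [((0, 2), (4, 6)), ((4, 8), (8, 10)), ((10, 12), (0, 10))]
lo, hi = fuzzy_corr_grid(example3)
print(round(lo, 3), round(hi, 3))  # -0.596 0.918
```

A grid search only approximates the extrema when they fall between grid points, so it is best read as a cross-check on the analytic result rather than a substitute for it.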
2.1. The Assumption of Corresponding Points of Each Rectangle
Our initial idea is to find the correlation coefficient for corresponding points of each rectangle. For Example 3, we can find that the correlation coefficient rxy=0 for the centroid of each rectangle.
Example 3.
Consider the rectangle sample data x1⊗y1=[0,2]⊗[4,6], x2⊗y2=[4,8]⊗[8,10], and x3⊗y3=[10,12]⊗[0,10], as shown in Figure 5.
For Example 3, we also can find the correlation coefficient rxy=0.918 for the upper-right point coordinate, the correlation coefficient rxy=-0.397 for the lower-right point coordinate, the correlation coefficient rxy=-0.596 for the lower-left point coordinate, and the correlation coefficient rxy=0.803 for the upper-left point coordinate of each rectangle, as shown in Figures 6, 7, 8, and 9.
We also can find the correlation coefficient rxy=0.466 for interior corresponding point of each rectangle; for example, the coordinate (s,t)=(1/3,3/4) as shown in Figure 10.
We assume that the coordinate (s,t) of corresponding point of each rectangle is fixed; for example, the coordinate (s,t)=(1/2,1/2) for Figure 5 and the coordinate (s,t)=(1,1) for Figure 6. However, we do not consider the case that each rectangle may have different coordinates (s,t), for example, as shown in Figure 11.
In Figure 11, the coordinate is (s,t)=(0,1) for rectangle A, (s,t)=(1,0) for rectangle B, and (s,t)=(0,0) for rectangle C; that is, the coordinates (s,t) differ from rectangle to rectangle.
Figures 6–9 alone do not show that the maximal value of the correlation coefficient is rxy=0.918 or that the minimal value is rxy=-0.596: each rectangle has infinitely many corresponding boundary and interior points, and the correlation coefficient would have to be evaluated at every one of them. Therefore, we use Steps 1 to 5 and the differentiation of a function of two variables to evaluate the maximal and minimal values of the correlation coefficient.
Based on Example 3, three point coordinates, F1(2s,4+2t), F2(4+4s,8+2t), and F3(10+2s,10t), are obtained. We then find the sample means x̄=(14+8s)/3 and ȳ=(12+14t)/3, respectively. Therefore, the correlation coefficient function between x and y is
$$r_{xy}=\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^{2}}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^{2}}}=f(s,t)=\frac{-9+3s+(16-2s)t}{2\sqrt{19-s+s^{2}}\,\sqrt{3-6t+4t^{2}}},\tag{2}$$
for the closed region bounded by Ω.
The first-order derivatives for s and t are, respectively,
$$f_s(s,t)=\frac{105-60t+15s-30st}{4\sqrt{3-6t+4t^{2}}\,(19-s+s^{2})^{3/2}},\qquad f_t(s,t)=\frac{42+6s-24t-12st}{4\sqrt{19-s+s^{2}}\,(3-6t+4t^{2})^{3/2}}.\tag{3}$$
Let fs(s,t)=ft(s,t)=0; both equations reduce to (s+2)(2t-1)=5. No point of this curve lies in Ω. The reason is as follows: if 0≤t≤1, then -1≤2t-1≤1, so s≤-7 or s≥3; and if 0≤s≤1, then 4/3≤t≤7/4. Hence, the critical points on the curve (s+2)(2t-1)=5 do not belong to the set Ω.
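The derivative formulas for Example 3 can be cross-checked numerically. The sketch below codes the closed form (2) and the partial derivatives (3), compares them with central finite differences, and confirms on a grid that both partial derivatives stay positive on Ω. The function names, the step size, and the grid resolution are assumptions made for this illustration.

```python
import math


def f(s, t):
    """Closed form (2) for Example 3."""
    num = -9 + 3 * s + (16 - 2 * s) * t
    return num / (2 * math.sqrt(19 - s + s * s) * math.sqrt(3 - 6 * t + 4 * t * t))


def f_s(s, t):
    """Partial derivative with respect to s, as in (3)."""
    num = 105 - 60 * t + 15 * s - 30 * s * t
    return num / (4 * math.sqrt(3 - 6 * t + 4 * t * t) * (19 - s + s * s) ** 1.5)


def f_t(s, t):
    """Partial derivative with respect to t, as in (3)."""
    num = 42 + 6 * s - 24 * t - 12 * s * t
    return num / (4 * math.sqrt(19 - s + s * s) * (3 - 6 * t + 4 * t * t) ** 1.5)


def central_diff(g, x, h=1e-6):
    """Symmetric finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


# Compare the analytic partials with finite differences at a few points.
for s, t in [(0.2, 0.3), (0.5, 0.5), (0.9, 0.1)]:
    ds = central_diff(lambda u: f(u, t), s)
    dt = central_diff(lambda u: f(s, u), t)
    assert abs(ds - f_s(s, t)) < 1e-6
    assert abs(dt - f_t(s, t)) < 1e-6

# Both partials stay positive on a grid over Omega, so no interior
# critical point shows up and f increases in s and in t.
grid = [i / 50 for i in range(51)]
print(min(f_s(s, t) for s in grid for t in grid) > 0,
      min(f_t(s, t) for s in grid for t in grid) > 0)  # True True
```

Because f increases in both s and t on Ω, its minimum and maximum are attained at the corners (0,0) and (1,1), respectively.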
The boundary of the region consists of the lines s=0, s=1, t=0, and t=1. Consideration of extrema on the boundary of the region along s=0 leads to the function $f(0,t)=\frac{-9+16t}{2\sqrt{19}\sqrt{3-6t+4t^{2}}}$, 0≤t≤1. There is no critical point when setting the derivative with respect to t equal to zero. The endpoints are (0,0) and (0,1). Consideration of extrema on the boundary of the region along s=1 leads to the function $f(1,t)=\frac{-6+14t}{2\sqrt{19}\sqrt{3-6t+4t^{2}}}$, 0≤t≤1. There is no critical point when setting the derivative with respect to t equal to zero. The endpoints are (1,0) and (1,1). Consideration of extrema on the boundary of the region along t=0 leads to the function $f(s,0)=\frac{-9+3s}{2\sqrt{3}\sqrt{19-s+s^{2}}}$, 0≤s≤1. There is no critical point when setting the derivative with respect to s equal to zero. The endpoints are (0,0) and (1,0). Consideration of extrema on the boundary of the region along t=1 leads to the function $f(s,1)=\frac{7+s}{2\sqrt{19-s+s^{2}}}$, 0≤s≤1. There is no critical point when setting the derivative with respect to s equal to zero. The endpoints are (0,1) and (1,1). All candidates for the maximum and minimum values are listed in Table 1. We see that the minimum value is f(0,0)≈-0.596; the maximum value is f(1,1)≈0.918. Therefore, the fuzzy interval number of the correlation coefficient is [-0.596,0.918].
Table 1: Candidates for the maximum and minimum values of f(s,t) in Example 3.
(s,t)      f(s,t)
(0,0)      -0.596
(0,1)       0.803
(1,0)      -0.397
(1,1)       0.918
Figure 5: The graph of Example 3 for the centroid of each rectangle.
Figure 6: The graph of Example 3 for the upper-right point coordinate of each rectangle.
Figure 7: The graph of Example 3 for the lower-right point coordinate of each rectangle.
Figure 8: The graph of Example 3 for the lower-left point coordinate of each rectangle.
Figure 9: The graph of Example 3 for the upper-left point coordinate of each rectangle.
Figure 10: The graph of Example 3 for the coordinate (s,t)=(1/3,3/4) of interior corresponding point of each rectangle.
Figure 11: There are different coordinates (s,t) in rectangles A, B, and C.
Theorem 4.
If f is continuous on a closed, bounded region, then f has a maximum value and a minimum value on the region. These extrema occur either (1) where all first partial derivatives of f are zero, (2) where some first partial derivative of f does not exist, or (3) on the boundary of the region.
Proof.
See [18].
3. Case Studies
In this section, we discuss some cases of fuzzy correlation coefficients. First, we analyze a case in which the maximal value 1 or the minimal value −1 of the correlation coefficient occurs in the closed region bounded by Ω. Second, we analyze a case in which neither the maximal value 1 nor the minimal value −1 of the correlation coefficient occurs in the closed region bounded by Ω.
Case 1.
The maximal value 1 or the minimal value −1 of the correlation coefficient occurs in the closed region bounded by Ω.
Example 5.
Consider the rectangle sample data x1⊗y1=[0,2]⊗[0,2], x2⊗y2=[2,6]⊗[2,6], and x3⊗y3=[6,8]⊗[6,8], as shown in Figure 12.
Based on the previous discussion, three point coordinates, F1(2s,2t), F2(2+4s,2+4t), and F3(6+2s,6+2t), are obtained. We then find the sample means x̄=(8+8s)/3 and ȳ=(8+8t)/3, respectively. Therefore, the correlation coefficient function between x and y is
$$r_{xy}=\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^{2}}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^{2}}}=f(s,t)=\frac{14-s+(-1+2s)t}{2\sqrt{7-s+s^{2}}\,\sqrt{7-t+t^{2}}},\tag{4}$$
for the closed region bounded by Ω.
The first-order derivatives for s and t are, respectively,
$$f_s(s,t)=\frac{27t-27s}{4\sqrt{7-t+t^{2}}\,(7-s+s^{2})^{3/2}},\qquad f_t(s,t)=\frac{27s-27t}{4\sqrt{7-s+s^{2}}\,(7-t+t^{2})^{3/2}}.\tag{5}$$
Let fs(s,t)=ft(s,t)=0; it follows that s=t. Infinitely many critical points lie on the line s=t within Ω; two of them are (0,0) and (1,1).
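As a worked check, substituting t=s into the numerator 14-s+(-1+2s)t and the denominator factors 7-s+s² and 7-t+t² of f for Example 5 shows that f equals 1 at every point of the line s=t, so all of these critical points attain the maximum:

```latex
f(s,s)
  = \frac{14 - s + (-1+2s)s}{2\sqrt{7-s+s^{2}}\,\sqrt{7-s+s^{2}}}
  = \frac{14 - 2s + 2s^{2}}{2\,(7-s+s^{2})}
  = \frac{2\,(7-s+s^{2})}{2\,(7-s+s^{2})}
  = 1 .
```

Geometrically, when s=t the three chosen points (2s,2s), (2+4s,2+4s), and (6+2s,6+2s) all lie on the line y=x, which is why the correlation is exactly 1.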
The boundary of the region consists of the lines s=0, s=1, t=0, and t=1. Consideration of extrema on the boundary of the region along s=0 leads to the function $f(0,t)=\frac{14-t}{2\sqrt{7}\sqrt{7-t+t^{2}}}$, 0≤t≤1. Setting the derivative with respect to t equal to zero gives the point (0,0). The endpoints are (0,0) and (0,1). Consideration of extrema on the boundary of the region along s=1 leads to the function $f(1,t)=\frac{13+t}{2\sqrt{7}\sqrt{7-t+t^{2}}}$, 0≤t≤1. Setting the derivative with respect to t equal to zero gives the point (1,1). The endpoints are (1,0) and (1,1). Consideration of extrema on the boundary of the region along t=0 leads to the function $f(s,0)=\frac{14-s}{2\sqrt{7}\sqrt{7-s+s^{2}}}$, 0≤s≤1. Setting the derivative with respect to s equal to zero gives the point (0,0). The endpoints are (0,0) and (1,0). Consideration of extrema on the boundary of the region along t=1 leads to the function $f(s,1)=\frac{13+s}{2\sqrt{7}\sqrt{7-s+s^{2}}}$, 0≤s≤1. Setting the derivative with respect to s equal to zero gives the point (1,1). The endpoints are (0,1) and (1,1). All candidates for the maximum and minimum values are listed in Table 2. We see that the minimum value is f(0,1)=f(1,0)≈0.929; the maximum value is f(0,0)=f(1,1)=1. Therefore, the fuzzy interval number of the correlation coefficient is [0.929,1].
In this case, the center points of these three rectangles are positively correlated, and rxy=1. Moreover, these three rectangles are approximately symmetric to the straight line y=x. Hence, the tendency of positive correlation of these three rectangles is high. In other words, the fuzzy correlation coefficient rxy may have a smaller range.
Table 2: Candidates for the maximum and minimum values of f(s,t) in Example 5.
(s,t)      f(s,t)
(0,0)      1
(0,1)      0.929
(1,0)      0.929
(1,1)      1
Figure 12: The graph of Example 5.
Example 6.
Consider the rectangle sample data x1⊗y1=[1,3]⊗[1,2], x2⊗y2=[3,5]⊗[0,3], and x3⊗y3=[3,9]⊗[3,4], as shown in Figure 13.
Based on the previous discussion, three point coordinates, F1(1+2s,1+t), F2(3+2s,3t), and F3(3+6s,3+t), are obtained. We then find the sample means x̄=(7+10s)/3 and ȳ=(4+5t)/3, respectively. Therefore, the correlation coefficient function between x and y is
$$r_{xy}=f(s,t)=\frac{1+10s+(2-4s)t}{2\sqrt{1+2s+4s^{2}}\,\sqrt{7-8t+4t^{2}}},\tag{6}$$
for the closed region bounded by Ω.
The first-order derivatives for s and t are, respectively,
$$f_s(s,t)=\frac{9-6t+(6-12t)s}{2\sqrt{7-8t+4t^{2}}\,(1+2s+4s^{2})^{3/2}},\qquad f_t(s,t)=\frac{9+6s+(-6-12s)t}{\sqrt{1+2s+4s^{2}}\,(7-8t+4t^{2})^{3/2}}.\tag{7}$$
Let fs(s,t)=ft(s,t)=0; it follows that (2t-1)(2s+1)=2. Infinitely many critical points lie on the curve (2t-1)(2s+1)=2 within Ω; two of them are (1/2,1) and (1,5/6).
The boundary of the region consists of the lines s=0, s=1, t=0, and t=1. Consideration of extrema on the boundary of the region along s=0 leads to the function $f(0,t)=\frac{1+2t}{2\sqrt{7-8t+4t^{2}}}$, 0≤t≤1. There is no critical point when setting the derivative with respect to t equal to zero. The endpoints are (0,0) and (0,1). Consideration of extrema on the boundary of the region along s=1 leads to the function $f(1,t)=\frac{11-2t}{2\sqrt{7}\sqrt{7-8t+4t^{2}}}$, 0≤t≤1. Setting the derivative with respect to t equal to zero gives the point (1,5/6). The endpoints are (1,0) and (1,1). Consideration of extrema on the boundary of the region along t=0 leads to the function $f(s,0)=\frac{1+10s}{2\sqrt{7}\sqrt{1+2s+4s^{2}}}$, 0≤s≤1. There is no critical point when setting the derivative with respect to s equal to zero. The endpoints are (0,0) and (1,0). Consideration of extrema on the boundary of the region along t=1 leads to the function $f(s,1)=\frac{3+6s}{2\sqrt{3}\sqrt{1+2s+4s^{2}}}$, 0≤s≤1. Setting the derivative with respect to s equal to zero gives the point (1/2,1). The endpoints are (0,1) and (1,1). All candidates for the maximum and minimum values are listed in Table 3. We see that the minimum value is f(0,0)≈0.189; the maximum value is f(1,5/6)=f(1/2,1)=1. Therefore, the fuzzy interval number of the correlation coefficient is [0.189,1].
In this case, the center points of these three rectangles are positively correlated, but 0<rxy<1. Moreover, these three rectangles are not symmetric with respect to any straight line. Hence, the tendency of positive correlation of these three rectangles is not evident. In other words, the fuzzy correlation coefficient rxy may have a larger range.
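One way to see why the value 1 is reached in Example 6 is to check that the three chosen points are collinear at the two critical points (1,5/6) and (1/2,1). The sketch below does this with exact rational arithmetic; the helper names `points` and `collinear` are assumptions introduced for this illustration.

```python
from fractions import Fraction as Fr

# Rectangles of Example 6 as ((x1, x2), (y1, y2)).
rects = [((1, 3), (1, 2)), ((3, 5), (0, 3)), ((3, 9), (3, 4))]


def points(s, t):
    """Points F_i at relative position (s, t) inside each rectangle."""
    return [(x1 + s * (x2 - x1), y1 + t * (y2 - y1))
            for (x1, x2), (y1, y2) in rects]


def collinear(pts):
    """True if the three points lie on one line (zero cross product)."""
    (x1, y1), (x2, y2), (x3, y3) = pts
    return (x2 - x1) * (y3 - y1) == (x3 - x1) * (y2 - y1)


for s, t in [(Fr(1), Fr(5, 6)), (Fr(1, 2), Fr(1))]:
    pts = points(s, t)
    slope = (pts[1][1] - pts[0][1]) / (pts[1][0] - pts[0][0])
    print((str(s), str(t)), collinear(pts), slope)
# ('1', '5/6') True 1/3
# ('1/2', '1') True 1/2
```

In both cases the points fall on a line with positive slope, so Pearson's coefficient equals 1 there, while at a corner such as (0,0) the points are not collinear.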
Table 3: Candidates for the maximum and minimum values of f(s,t) in Example 6.
(s,t)      f(s,t)
(1,5/6)    1
(1/2,1)    1
(0,0)      0.189
(0,1)      0.866
(1,0)      0.786
(1,1)      0.982
Figure 13: The graph of Example 6.
Case 2.
Neither the maximal value 1 nor the minimal value −1 of the correlation coefficient occurs in the closed region bounded by Ω.
Example 7.
Consider the rectangle sample data x1⊗y1=[0,2]⊗[0,2], x2⊗y2=[2,6]⊗[2,4], and x3⊗y3=[7,9]⊗[0,4], as shown in Figure 14.
Based on the previous discussion, three point coordinates, F1(2s,2t), F2(2+4s,2+2t), and F3(7+2s,4t), are obtained. We then find the sample means x̄=(9+8s)/3 and ȳ=(2+8t)/3, respectively. Therefore, the correlation coefficient function between x and y is
$$r_{xy}=f(s,t)=\frac{-3+4s+(12-2s)t}{2\sqrt{39-6s+4s^{2}}\,\sqrt{1-t+t^{2}}},\tag{8}$$
for the closed region bounded by Ω.
The first-order derivatives for s and t are, respectively,
$$f_s(s,t)=\frac{147-42t-42ts}{2\sqrt{1-t+t^{2}}\,(39-6s+4s^{2})^{3/2}},\qquad f_t(s,t)=\frac{21+(-6-6s)t}{4\sqrt{39-6s+4s^{2}}\,(1-t+t^{2})^{3/2}}.\tag{9}$$
Let fs(s,t)=ft(s,t)=0; it follows that t(2s+2)=7. There is no critical point for equation t(2s+2)=7 bounded by Ω. The reason is as follows: If 0≤t≤1, then we obtain s≥5/2, and if 0≤s≤1, then we obtain 7/4≤t≤7/2. Hence, the critical points for equation t(2s+2)=7 do not belong to the set Ω.
The boundary of the region consists of the lines s=0, s=1, t=0, and t=1. Consideration of extrema on the boundary of the region along s=0 leads to the function $f(0,t)=\frac{-3+12t}{2\sqrt{39}\sqrt{1-t+t^{2}}}$, 0≤t≤1. There is no critical point when setting the derivative with respect to t equal to zero. The endpoints are (0,0) and (0,1). Consideration of extrema on the boundary of the region along s=1 leads to the function $f(1,t)=\frac{1+10t}{2\sqrt{37}\sqrt{1-t+t^{2}}}$, 0≤t≤1. There is no critical point when setting the derivative with respect to t equal to zero. The endpoints are (1,0) and (1,1). Consideration of extrema on the boundary of the region along t=0 leads to the function $f(s,0)=\frac{-3+4s}{2\sqrt{39-6s+4s^{2}}}$, 0≤s≤1. There is no critical point when setting the derivative with respect to s equal to zero. The endpoints are (0,0) and (1,0). Consideration of extrema on the boundary of the region along t=1 leads to the function $f(s,1)=\frac{9+2s}{2\sqrt{39-6s+4s^{2}}}$, 0≤s≤1. There is no critical point when setting the derivative with respect to s equal to zero. The endpoints are (0,1) and (1,1). All candidates for the maximum and minimum values are listed in Table 4. We see that the minimum value is f(0,0)≈-0.240; the maximum value is f(1,1)≈0.904. Therefore, the fuzzy interval number of the correlation coefficient is [-0.240,0.904].
Based on the scatter plots of Examples 6 and 7, we intuitively think that the scatter plot of Example 7 is more dispersed than the scatter plot of Example 6. Therefore, the fuzzy correlation coefficient rxy of Example 7 will have a larger range.
Table 4: Candidates for the maximum and minimum values of f(s,t) in Example 7.
(s,t)      f(s,t)
(0,0)      -0.240
(0,1)       0.721
(1,0)       0.082
(1,1)       0.904
Figure 14: The graph of Example 7.
Example 8.
Consider the rectangle sample data x1⊗y1=[0,2]⊗[0,2], x2⊗y2=[2,6]⊗[2,4], x3⊗y3=[7,9]⊗[0,4], and x4⊗y4=[9,13]⊗[4,6], as shown in Figure 15.
Based on the previous discussion, four point coordinates, F1(2s,2t), F2(2+4s,2+2t), F3(7+2s,4t), and F4(9+4s,4+2t), are obtained. We then find the sample means x̄=(9+6s)/2 and ȳ=(3+5t)/2, respectively. Therefore, the correlation coefficient function between x and y is
$$r_{xy}=f(s,t)=\frac{13+6s+(5-2s)t}{\sqrt{53+8s+4s^{2}}\,\sqrt{11-6t+3t^{2}}},\tag{10}$$
for the closed region bounded by Ω.
The first-order derivatives for s and t are, respectively,
$$f_s(s,t)=\frac{266-126t+(-28-28t)s}{\sqrt{11-6t+3t^{2}}\,(53+8s+4s^{2})^{3/2}},\qquad f_t(s,t)=\frac{94-4s+(-54-12s)t}{\sqrt{53+8s+4s^{2}}\,(11-6t+3t^{2})^{3/2}}.\tag{11}$$
First, let fs(s,t)=ft(s,t)=0; it follows that t=(47-2s)/(27+6s)=(133-14s)/(63+14s). We obtain (s,t)=(5/2,1). This critical point does not belong to the set Ω. Therefore, no local maximum or minimum is in Ω.
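The critical point of Example 8 can be obtained by equating the two expressions for t, namely (47-2s)/(27+6s) from ft=0 and (133-14s)/(63+14s) from fs=0, and cross-multiplying. The following worked step is a sketch of that algebra:

```latex
\begin{aligned}
(47-2s)(63+14s) &= (133-14s)(27+6s) \\
2961 + 532s - 28s^{2} &= 3591 + 420s - 84s^{2} \\
56s^{2} + 112s - 630 &= 0 \\
4s^{2} + 8s - 45 &= 0 \\
(2s-5)(2s+9) &= 0 .
\end{aligned}
```

The root s=-9/2 makes both denominators 27+6s and 63+14s vanish, so it yields no solution; the root s=5/2 gives t=(47-5)/(27+15)=1, which is the point (5/2,1) outside Ω.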
Second, the boundary of the region consists of the lines s=0, s=1, t=0, and t=1. Consideration of extrema on the boundary of the region along s=0 leads to the function $f(0,t)=\frac{13+5t}{\sqrt{53}\sqrt{11-6t+3t^{2}}}$, 0≤t≤1. There is no critical point when setting the derivative with respect to t equal to zero. The endpoints are (0,0) and (0,1). Consideration of extrema on the boundary of the region along s=1 leads to the function $f(1,t)=\frac{19+3t}{\sqrt{65}\sqrt{11-6t+3t^{2}}}$, 0≤t≤1. There is no critical point when setting the derivative with respect to t equal to zero. The endpoints are (1,0) and (1,1). Consideration of extrema on the boundary of the region along t=0 leads to the function $f(s,0)=\frac{13+6s}{\sqrt{11}\sqrt{53+8s+4s^{2}}}$, 0≤s≤1. There is no critical point when setting the derivative with respect to s equal to zero. The endpoints are (0,0) and (1,0). Consideration of extrema on the boundary of the region along t=1 leads to the function $f(s,1)=\frac{18+4s}{2\sqrt{2}\sqrt{53+8s+4s^{2}}}$, 0≤s≤1. There is no critical point when setting the derivative with respect to s equal to zero. The endpoints are (0,1) and (1,1). All candidates for the maximum and minimum values are listed in Table 5. We see that the minimum value is f(0,0)≈0.538; the maximum value is f(1,1)≈0.965. Therefore, the fuzzy interval number of the correlation coefficient is [0.538,0.965].
Comparing the scatter plots of Examples 7 and 8, we intuitively think that the scatter plot of Example 8 is more concentrated and has a greater tendency of positive correlation than that of the scatter plot of Example 7. Moreover, when fuzzy interval data of the scatter plot increase, the fuzzy correlation coefficient will have a smaller range.
Lin et al. [17] proposed the formula of the fuzzy correlation coefficient rxy for the following four situations:
$$r_{xy}=\begin{cases}[\,r_c,\ \min(1,\,r_c+\delta)\,] & \text{if } r_c\ge 0,\ r_l\ge 0,\\ [\,r_c-\delta,\ r_c\,] & \text{if } r_c\ge 0,\ r_l<0,\\ [\,r_c,\ r_c+\delta\,] & \text{if } r_c<0,\ r_l\ge 0,\\ [\,\max(-1,\,r_c-\delta),\ r_c\,] & \text{if } r_c<0,\ r_l<0,\end{cases}\tag{12}$$
where rc is the correlation coefficient of the center point of each rectangle, rl is the correlation coefficient of the interval lengths lxi and lyi of each of the fuzzy interval numbers xi and yi, and δ=1-ln(1+rl)/rl.
Next, the four scatter plots are observed as follows.
Intuitively, the degrees of spread of the four scatter plots (refer to Figures 16, 17, 18, and 19) do not seem to be the same. Hence, the fuzzy correlation coefficients should not be equal. But the formula (12) of Wu et al. [16] shows that the four fuzzy correlation coefficients are equal, and rxy=[1,1].
However, our proposed method obtains different results. The four fuzzy correlation coefficients (refer to Figures 16, 17, 18, and 19) obtained through our proposed method are [1,1], [0.976,1], [0.922,1], and [0.857,1], respectively. Therefore, our proposed method produces results that are more consistent with our intuition.
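Table 5: Candidates for the maximum and minimum values of f(s,t) in Example 8.
(s,t)      f(s,t)
(0,0)      0.538
(0,1)      0.874
(1,0)      0.711
(1,1)      0.965
Figure 15: The graph of Example 8.
Figure 16: Scatter plot 1.
Figure 17: Scatter plot 2.
Figure 18: Scatter plot 3.
Figure 19: Scatter plot 4.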
4. Empirical Studies
In this section, we discuss some applications of fuzzy correlation coefficients. First, we analyze a case in which two data sets are fuzzy interval numbers. Second, we change the case to one in which one data set is a fuzzy interval number, and the other is a real number. Finally, we analyze a case in which both data sets are real numbers.
To understand the factors influencing mathematics achievement at a school, we investigate 10 students’ data.
Example 9.
Consider the rectangle sample data for 10 students: x1⊗y1=[80,90]⊗[1,1.5], x2⊗y2=[80,90]⊗[6.5,7], x3⊗y3=[60,80]⊗[4,5], x4⊗y4=[90,100]⊗[1.5,2.5], x5⊗y5=[40,70]⊗[16,17], x6⊗y6=[70,80]⊗[15,16], x7⊗y7=[60,80]⊗[15,17], x8⊗y8=[80,100]⊗[1,3], x9⊗y9=[80,90]⊗[0,3], and x10⊗y10=[80,90]⊗[4.5,5.5], where xi and yi denote the mathematics score and weekly online time, respectively, of a student i, i=1,2,…,10, as shown in Figure 20.
Based on the previous discussion, the correlation coefficient function between x and y is
$$r_{xy}=f(s,t)=\frac{-639+197.5s+(4+5s)t}{\sqrt{10}\,\sqrt{196-160s+45s^{2}}\,\sqrt{372.7-14.2t+5.54t^{2}}},\tag{13}$$
for the closed region bounded by Ω.
The first-order derivatives for s and t are, respectively,
$$f_s(s,t)=\frac{-12410+1300t+(12955-580t)s}{\sqrt{10}\,\sqrt{372.7-14.2t+5.54t^{2}}\,(196-160s+45s^{2})^{3/2}},\qquad f_t(s,t)=\frac{-3046.1+3265.75s+(3511.66-1129.65s)t}{\sqrt{10}\,\sqrt{196-160s+45s^{2}}\,(372.7-14.2t+5.54t^{2})^{3/2}}.\tag{14}$$
First, let fs(s,t)=ft(s,t)=0; it follows that (s,t)≈(3.239,51.066)∉Ω or (s,t)≈(0.960,-0.037)∉Ω. Therefore, no local maximum or minimum is in Ω.
Second, based on the previous discussion, the fuzzy interval number of the correlation coefficient is [-0.804,-0.748].
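The interval for Example 9 can be cross-checked directly from the listed data. The sketch below is not the authors' procedure: it evaluates Pearson's coefficient at the point (s,t) of every rectangle and replaces the differentiation step with a grid search over Ω. The function name `f` and the grid resolution are assumptions for this illustration, and later digits may differ slightly from values computed with the rounded coefficients of the closed form.

```python
import math

# Example 9 data: (mathematics score interval, weekly online time interval).
students = [
    ((80, 90), (1, 1.5)), ((80, 90), (6.5, 7)), ((60, 80), (4, 5)),
    ((90, 100), (1.5, 2.5)), ((40, 70), (16, 17)), ((70, 80), (15, 16)),
    ((60, 80), (15, 17)), ((80, 100), (1, 3)), ((80, 90), (0, 3)),
    ((80, 90), (4.5, 5.5)),
]


def f(rects, s, t):
    """Pearson correlation of the points at relative position (s, t)."""
    xs = [x1 + s * (x2 - x1) for (x1, x2), _ in rects]
    ys = [y1 + t * (y2 - y1) for _, (y1, y2) in rects]
    n = len(rects)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


steps = 200  # grid resolution (an arbitrary choice for this sketch)
vals = [f(students, i / steps, j / steps)
        for i in range(steps + 1) for j in range(steps + 1)]
print(round(min(vals), 3), round(max(vals), 3))  # -0.804 -0.748
```

The grid maximum is attained at the corner (0,0), and the grid minimum lies on the edge t=0 close to s=1, which agrees with the absence of critical points inside Ω.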
Clearly, x and y have a highly negative correlation. In other words, a higher mathematics score correlates with a lower weekly online time. The weekly online time of a student negatively influences the student’s mathematics score.
Figure 20: The graph of Example 9.
Example 10.
If Example 9 is adjusted to x1⊗y1=[85,85]⊗[1,1.5], x2⊗y2=[85,85]⊗[6.5,7], x3⊗y3=[70,70]⊗[4,5], x4⊗y4=[95,95]⊗[1.5,2.5], x5⊗y5=[55,55]⊗[16,17], x6⊗y6=[75,75]⊗[15,16], x7⊗y7=[70,70]⊗[15,17], x8⊗y8=[90,90]⊗[1,3], x9⊗y9=[85,85]⊗[0,3], and x10⊗y10=[85,85]⊗[4.5,5.5] (i.e., xi is a real number, i=1,2,…,10), then the correlation coefficient function between x and y is
$$r_{xy}=f(s=0.5,t)=\frac{-540.25+6.5t}{11.28\sqrt{10}\,\sqrt{372.7-14.2t+5.54t^{2}}},\tag{15}$$
bounded by the set {t∣0≤t≤1}.
The first-order derivative with respect to t is
$$f_t(s=0.5,t)=\frac{-1413.23+2946.84t}{11.28\sqrt{10}\,(372.7-14.2t+5.54t^{2})^{3/2}}.\tag{16}$$
Let ft(s=0.5,t)=0; it follows that t≈0.48, which is a critical point bounded by the set {t∣0≤t≤1}. Based on the previous discussion, the fuzzy interval number of the correlation coefficient is [-0.786,-0.784].
Example 11.
If Example 9 is adjusted to x1⊗y1=[85,85]⊗[1.25,1.25], x2⊗y2=[85,85]⊗[6.75,6.75], x3⊗y3=[70,70]⊗[4.5,4.5], x4⊗y4=[95,95]⊗[2,2], x5⊗y5=[55,55]⊗[16.5,16.5], x6⊗y6=[75,75]⊗[15.5,15.5], x7⊗y7=[70,70]⊗[16,16], x8⊗y8=[90,90]⊗[2,2], x9⊗y9=[85,85]⊗[1.5,1.5], and x10⊗y10=[85,85]⊗[5,5] (i.e., both xi and yi are real numbers, i=1,2,…,10), then the correlation coefficient between x and y is rxy=f(s=0.5,t=0.5)=-0.786.
According to the results of Examples 9, 10, and 11, we find that Example 9 is the generalized situation of Examples 10 and 11.
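The reduction from Example 9 to Examples 10 and 11 can be illustrated by passing zero-width intervals to the same computation. This is a minimal sketch under stated assumptions: the function name `f`, the list names, and the one-dimensional scan over t (used here in place of differentiation) are choices made for the illustration.

```python
import math


def f(rects, s, t):
    """Pearson correlation of the points at relative position (s, t)."""
    xs = [x1 + s * (x2 - x1) for (x1, x2), _ in rects]
    ys = [y1 + t * (y2 - y1) for _, (y1, y2) in rects]
    n = len(rects)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Centers of the Example 9 score intervals, as used in Examples 10 and 11.
x_centers = [85, 85, 70, 95, 55, 75, 70, 90, 85, 85]
y_intervals = [(1, 1.5), (6.5, 7), (4, 5), (1.5, 2.5), (16, 17),
               (15, 16), (15, 17), (1, 3), (0, 3), (4.5, 5.5)]
y_centers = [(a + b) / 2 for a, b in y_intervals]

# Example 10: x is crisp (zero-width interval), y is interval-valued.
ex10 = [((x, x), y) for x, y in zip(x_centers, y_intervals)]
# Example 11: both x and y are crisp.
ex11 = [((x, x), (y, y)) for x, y in zip(x_centers, y_centers)]

ts = [j / 1000 for j in range(1001)]
vals10 = [f(ex10, 0.5, t) for t in ts]  # s has no effect when widths are 0
print(round(min(vals10), 3), round(max(vals10), 3))  # -0.786 -0.784
print(round(f(ex11, 0.5, 0.5), 3))                   # -0.786
```

With zero-width intervals the chosen point no longer depends on the corresponding parameter, so the two-variable problem collapses to one variable in Example 10 and to a single Pearson coefficient in Example 11.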
Example 12.
Consider the rectangle sample data of 10 students: x1⊗y1=[80,90]⊗[8,8.5], x2⊗y2=[80,90]⊗[7,7.5], x3⊗y3=[60,80]⊗[9,10.5], x4⊗y4=[90,100]⊗[8,8.5], x5⊗y5=[40,70]⊗[6,7.5], x6⊗y6=[70,80]⊗[10,11], x7⊗y7=[60,80]⊗[7,8], x8⊗y8=[80,100]⊗[8,10], x9⊗y9=[80,90]⊗[6.5,8], and x10⊗y10=[80,90]⊗[7.5,8.5], where xi and yi denote the mathematics score and weekly sleeping time, respectively, of a student i, i=1,2,…,10, as shown in Figure 21.
Based on the previous discussion, the correlation coefficient function between x and y is
$$r_{xy}=f(s,t)=\frac{36-25s+(-27+20s)t}{\sqrt{10}\,\sqrt{196-160s+45s^{2}}\,\sqrt{12.6-0.9t+2.4t^{2}}},\tag{17}$$
for the closed region bounded by Ω.
The first-order derivatives for s and t are, respectively,
$$f_s(s,t)=\frac{-2020+1760t+(380-385t)s}{\sqrt{10}\,\sqrt{12.6-0.9t+2.4t^{2}}\,(196-160s+45s^{2})^{3/2}},\qquad f_t(s,t)=\frac{-648+481.5s+(-148.5+102s)t}{2\sqrt{10}\,\sqrt{196-160s+45s^{2}}\,(12.6-0.9t+2.4t^{2})^{3/2}}.\tag{18}$$
First, let fs(s,t)=ft(s,t)=0; it follows that (s,t)≈(4.697,-4.881)∉Ω or (1.368,1.216)∉Ω. Therefore, no local maximum or minimum is in Ω.
Second, based on the previous discussion, the fuzzy interval number of the correlation coefficient is [0.037,0.229].
Based on the previous discussion, there is a minor positive correlation between mathematics score and weekly sleeping time. Therefore, a higher mathematics score correlates weakly with a higher weekly sleeping time. The influence of weekly sleeping time on mathematics score is minor.
Figure 21: The graph of Example 12.
Example 13.
Consider the rectangle sample data of 10 students: x1⊗y1=[80,90]⊗[70,80], x2⊗y2=[80,90]⊗[80,90], x3⊗y3=[60,80]⊗[70,80], x4⊗y4=[90,100]⊗[80,90], x5⊗y5=[40,70]⊗[60,70], x6⊗y6=[70,80]⊗[60,80], x7⊗y7=[60,80]⊗[70,80], x8⊗y8=[80,100]⊗[80,90], x9⊗y9=[80,90]⊗[60,70], and x10⊗y10=[80,90]⊗[70,80], where xi and yi denote the mathematics and Chinese scores, respectively, of a student i, i=1,2,…,10, as shown in Figure 22.
Based on the previous discussion, the correlation coefficient function between x and y is
$$r_{xy}=f(s,t)=\frac{60-10s+(-2-5s)t}{\sqrt{196-160s+45s^{2}}\,\sqrt{60-20t+9t^{2}}},\tag{19}$$
for the closed region bounded by Ω.
The first-order derivatives with respect to s and t are, respectively,
$$f_s(s,t)=\frac{2840-1140t+(-1900+490t)s}{\sqrt{60-20t+9t^{2}}\,(196-160s+45s^{2})^{3/2}},\qquad f_t(s,t)=\frac{480-400s+(-520+140s)t}{\sqrt{196-160s+45s^{2}}\,(60-20t+9t^{2})^{3/2}}.\tag{20}$$
First, let fs(s,t)=ft(s,t)=0; it follows that (s,t)≈(8.324,4.416)∉Ω or (1.596,-0.534)∉Ω. Therefore, no local maximum or minimum is in Ω.
Second, based on the previous discussion, the fuzzy interval number of the correlation coefficient is [0.553,0.717].
Based on the previous discussion, x and y have a highly positive correlation. Therefore, a higher mathematics score correlates with a higher Chinese score. Students’ Chinese scores positively influence their mathematics scores.
Figure 22: The graph of Example 13.
5. Conclusion
Scientists are accustomed to using binary logic to analyze information. Human logic is fuzzy and complex, and applying binary logic to analyze human thought processes causes some distortion. Fuzzy logic is based on human thought processes and has therefore been increasingly applied to social science.
Possible methods of calculating fuzzy correlation coefficients are proposed in the literature, but understanding most formulas used in the literature requires a strong mathematical background. In this paper, we use Pearson’s correlation coefficient and the differentiation method to evaluate fuzzy correlation coefficients, which can be applied to cases in which two data sets are fuzzy interval numbers, one of two data sets is a fuzzy interval number and the other is a real number, and both data sets are real numbers.
This paper discusses only fuzzy correlation coefficients of fuzzy interval numbers. In the future, we will extend the research method used here to triangular and trapezoidal fuzzy numbers.
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This work of Berlin Wu is partially supported by the National Science Council of the Republic of China under Contract 102–2410-H-004-182.
References
[1] Wu B.
[2] Nguyen H. T., Kreinovich V., Wu B., Xiang G.
[3] Nguyen H., Wu B.
[4] Hong D. H., Hwang S. Y. Correlation of intuitionistic fuzzy sets in probability spaces.
[5] Zadeh L. A. Fuzzy sets as a basis for a theory of possibility.
[6] Chiang D.-A., Lin N. P. Correlation of fuzzy sets.
[7] Chaudhuri B. B., Bhattacharya A. On correlation between two fuzzy sets.
[8] Hong D. Fuzzy measures for a correlation coefficient of fuzzy numbers under TW (the weakest t-norm)-based fuzzy arithmetic operations.
[9] Ni Y., Cheung J. Y. Correlation coefficient estimate for fuzzy data.
[10] Liu S.-T., Kao C. Fuzzy measures for correlation coefficient of fuzzy numbers.
[11] Xie M. C., Wu B. The relationship between high schools students time management and academic performance: an application of fuzzy correlation.
[12] Yang C. C. Correlation coefficient evaluation for the fuzzy interval data.
[13] Saneifard R., Saneifard R. Correlation coefficient between fuzzy numbers based on central interval.
[14] Cheng Y. T., Yang C. C. The application of fuzzy correlation coefficient with fuzzy interval data.
[15] Hanafy I. M., Salama A. A., Mahfouz K. M. Correlation coefficients of neutrosophic sets by centroid method.
[16] Wu B., Lai W., Wu C. L., Tienliu T. K. Correlation with fuzzy data and its applications in the 12-year compulsory education in Taiwan.
[17] Lin H., Wang C., Chen J. C., Wu B. New statistical analysis in marketing research with fuzzy data.
[18] Hunt R. A.