An Application of Fuzzy Multiple Linear Regression in Biological Paradigm

Department of Mathematics and Statistics, PMAS Arid Agriculture University, Rawalpindi, Pakistan
Institute of Numerical Sciences, Kohat University of Science and Technology, Kohat, Pakistan
Division of Continuing Education, PMAS-Arid Agriculture University, Rawalpindi, Pakistan
School of Mathematics, Thapar Institute of Engineering and Technology, Deemed University, Patiala 147004, Punjab, India
Department of Mathematics and Computer Science, University of Agadez, Agadez, Niger


Introduction
Regression is a method for determining the statistical relationship between two or more variables where a change in a dependent variable is associated with a change in one or more independent variables [1]. Multiple linear regression describes the relationship of one dependent variable with more than one independent variable.
This is a statistical technique used to examine how multiple independent variables and a dependent variable are related. Certain assumptions need to be fulfilled to achieve reliable results from multiple regression, such as linearity, normality, no multicollinearity, and homoscedasticity. Fuzzy set theory was first introduced by Zadeh [2] in 1965; it is a technique used to handle vague, uncertain, imprecise, or unclear information, and it is appropriate in the case of vague information [1]. Many recent developments in fuzzy theory and its applications have been explored by different researchers [3][4][5]. They have applied different fuzzy tools, and the applications of these tools have been explored in fields such as decision-making and logistics processes. Fuzzy regression was proposed by the Japanese researcher Tanaka (1982) [6]. The fuzzy linear regression model has been treated from diverse points of view, depending on the type of input data and the type of output to be obtained [7]. Fuzzy regression can be used in very complex real-world systems such as the economy, marketing, finance, ecology, and industry; see [7]. Dengue fever is a viral illness transmitted by mosquito bites that infects approximately 96 million people in America yearly [8]. In recent decades, dengue has increased in incidence and geographic distribution. The impact of climatic factors on transmission was investigated by researchers in [9,10]. Zadeh [2] proposed fuzzy set theory to deal with the vagueness and uncertainties occurring in decision making. One of its applications is in multisensing paradigm-based urban air quality monitoring and hazardous gas source analysis [11]. Multiple linear regression is often used to explain the relationship between a dependent variable and multiple independent variables.
This method is particularly appropriate for disease models and is widely used in the health sciences. In this paper, we discuss the problems that occur in different situations of multiple linear regression. First, it assumes a linear relationship between the independent and dependent variables, which may not give clear results for the given information. Second, it assumes that the errors between the observed and expected values are normally distributed, which can cause imprecision.
Third, it assumes the data do not contain multicollinearity. Another difficulty can be caused by imprecise and vague observations, which occur frequently in practice. To overcome these difficulties, we apply the fuzzy approach to multiple linear regression. This paper focuses on multiple linear regression in a biological paradigm, with a numerical computation based on a dengue study.

Fuzzy Sets and Numbers
The fuzzy set is defined as a set with vague boundaries [2]. The theory comprises fuzzy logic and linguistic variables. Fuzzy sets are represented as

$$A = \mu_1/X_1 + \mu_2/X_2 + \cdots + \mu_n/X_n,$$

where $\mu_1, \ldots, \mu_n$ express the membership grades of $X_i$, $i = 1, 2, \ldots, n$, in $A$, the slash is a separator rather than a division, and the union is denoted by the plus sign [12]. Fuzzy sets are mathematical models of unclear quantitative or qualitative data, which are frequently generated from natural language. The model depends upon the generalization of the classical concept of a set and its characteristic function.
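As an illustration of membership grades, the sketch below evaluates the two membership shapes compared later in this paper, triangular and trapezoidal. The function names, the parameterization by corner points, and the body-temperature example are assumptions made for illustration and do not come from the study data.

```python
def triangular(x, a, b, c):
    """Membership grade of x in the triangular fuzzy number (a, b, c), a < b < c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)      # falling edge


def trapezoidal(x, a, b, c, d):
    """Membership grade of x in the trapezoidal fuzzy number (a, b, c, d), a < b <= c < d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    if x <= c:
        return 1.0                # plateau
    return (d - x) / (d - c)      # falling edge


# Hypothetical fuzzy set "about 38 degrees" for a body-temperature reading.
readings = (37.0, 37.5, 38.0, 38.5, 39.0)
grades = {t: triangular(t, 37.0, 38.0, 39.0) for t in readings}
print(grades)
```

Each printed pair corresponds to one term $\mu_i/X_i$ of the representation above: the reading is $X_i$ and the grade is $\mu_i$.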

Multiple Linear Regression Model
Sir Francis Galton [1], an English Victorian scientist, introduced the term regression. The general parametric equation is

$$Y = \tau_0 + \tau_1 X_1 + \tau_2 X_2 + \cdots + \tau_q X_q + \epsilon,$$

where $Y$ and $X_1, \ldots, X_q$ represent the dependent and independent variables, $\tau_0$ is the intercept, the coefficients $\tau_1, \tau_2, \ldots, \tau_q$ represent slopes, and $\epsilon$ is the random error. A fuzzy regression technique was first proposed by Tanaka [6]. We have considered the following cases of dependent and independent variables.

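Case 1: $\tilde{Z}_i = \tau_0 + \tau_1 \tilde{K}_i$

Case 2: $\tilde{Z}_i = \tilde{\tau}_0 + \tilde{\tau}_1 K_i$

Case 3: $\tilde{Z}_i = \tilde{\tau}_0 + \tilde{\tau}_1 \tilde{K}_i$

In the above models, a tilde marks a fuzzy quantity, and $\tau_0$ and $\tau_1$ are the intercept and slope of the regression line, respectively. $\tilde{Z}_i$ are the fuzzy responses. In Case 1, the parameters $\tau_0$ and $\tau_1$ are crisp, and the $K_i$ are fuzzy. In Case 2, the parameters $\tau_0$ and $\tau_1$ are fuzzy, but the $K_i$ are crisp. In Case 3, the predictor and the parameters are all fuzzy.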
Consider the multiple fuzzy regression model, which can be generalized as follows:

Using the centered values of the crisp predictor, the above equation can be written in matrix form as follows:

$$y = x\tau + \epsilon,$$

where $y$ is an $(n \times 1)$ fuzzy vector, $x$ is an $(n \times p)$ matrix of $p$ fuzzy predictors, and $\tau$ is a $(p \times 1)$ vector of $p$ unknown fuzzy parameters. As a result of the lack of linearity of $F_c(\mathbb{R}^p)$, the error term $\epsilon$ is reduced to a nonfuzzy random variable.
where $y_{ja}$, $y_{jb}$, $y_{jc}$ are the left, middle, and right values, respectively. The $\tau_{ja}$ is as follows:

On the same lines, the above equations can be simplified as

Consider the multiple fuzzy regression model generalized as follows:

$$y = x\tau + \epsilon,$$

where $y$ is an $(n \times 1)$ fuzzy vector, $x$ is an $(n \times p)$ matrix of $p$ crisp predictors, and $\tau$ is a $(p \times 1)$ vector of $p$ unknown fuzzy parameters. As a result of the lack of linearity of $F_c(\mathbb{R}^p)$, the error term $\epsilon$ is reduced to a nonfuzzy random variable.
The $\tau_{ja}$ is as follows:

On the same lines as above, the equation is simplified as
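As an illustration only, and not necessarily the estimator used in this paper, the sketch below handles a triangular fuzzy response with crisp predictors by fitting one ordinary least-squares model to each of the left, middle, and right values ($y_{ja}$, $y_{jb}$, $y_{jc}$) on centered predictors. The component-wise approach, the function name `fit_triangular_response`, and the synthetic data are all assumptions.

```python
import numpy as np


def fit_triangular_response(X, Y_tri):
    """Fit one least-squares model per component of a triangular fuzzy response.

    X     : (n, p) array of crisp predictors
    Y_tri : (n, 3) array whose columns are the left, middle and right values
    Returns a (p + 1, 3) array; column k holds the intercept and the p slopes
    for component k (0 = left, 1 = middle, 2 = right).
    """
    X = np.asarray(X, dtype=float)
    Y_tri = np.asarray(Y_tri, dtype=float)
    Xc = X - X.mean(axis=0)                           # centered predictors
    design = np.column_stack([np.ones(len(Xc)), Xc])  # intercept column + predictors
    coef, *_ = np.linalg.lstsq(design, Y_tri, rcond=None)
    return coef


# Synthetic stand-in data (NOT the dengue data): symmetric spread of 0.5.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]])
middle = 2.0 + 0.3 * X[:, 0] + 0.1 * X[:, 1]
Y_tri = np.column_stack([middle - 0.5, middle, middle + 0.5])

coef = fit_triangular_response(X, Y_tri)
print(np.round(coef, 3))
```

Because the predictors are centered, each fitted intercept equals the mean of the corresponding response component. Separate fits do not by themselves guarantee that the predicted left value stays below the predicted middle and right values, so a practical implementation would need to check or enforce that ordering.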

Numerical Computation
This research is based on primary data of dengue fever patients collected from two public sector hospitals in Rawalpindi, Benazir and Holy Family. After collecting the data, we applied principal component analysis to eliminate insignificant variables and retain those variables that have a significant impact on our study. We applied the Kaiser-Meyer-Olkin and Bartlett's tests for the reduction of data. If the p value of the normality test is greater than 0.05, the data are treated as normal. Table 1 and Figure 1 indicate that all the p values are greater than 0.05, so we conclude that our data are normal.
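Bartlett's test, mentioned above as a check before data reduction, can be sketched as follows. The block computes the standard test statistic for Bartlett's test of sphericity, $\chi^2 = -\left(n - 1 - \tfrac{2p + 5}{6}\right)\ln|R|$ with $p(p-1)/2$ degrees of freedom, where $R$ is the sample correlation matrix. The function name and the synthetic data are assumptions, and the p value (obtained by comparing the statistic with a chi-square distribution) is left out to keep the example dependent on NumPy only.

```python
import numpy as np


def bartlett_sphericity(X):
    """Return (statistic, degrees_of_freedom) for Bartlett's test of sphericity.

    X is an (n, p) data matrix; the null hypothesis is that the population
    correlation matrix is the identity (variables are uncorrelated).
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    corr = np.corrcoef(X, rowvar=False)
    _, logdet = np.linalg.slogdet(corr)  # log of the determinant of R
    statistic = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return statistic, dof


# Synthetic stand-in data (NOT the dengue data): three variables sharing a common factor.
rng = np.random.default_rng(0)
common = rng.normal(size=(100, 1))
data = common + 0.5 * rng.normal(size=(100, 3))

stat, dof = bartlett_sphericity(data)
print("chi-square statistic:", round(float(stat), 2), "degrees of freedom:", dof)
```

A large statistic relative to the chi-square distribution with the stated degrees of freedom indicates correlated variables, which is the situation in which principal component analysis is worthwhile.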

Multicollinearity.
Here, using the VIF (variance inflation factor) values, we see that each VIF value is below 10, so the assumption of no multicollinearity is fulfilled. The multiple linear regression results are explained in Table 2 and represented by Figure 2.
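The check and the fit described above can be sketched as follows. The block computes the VIF of each predictor from the inverse of the correlation matrix (whose $j$-th diagonal entry equals $1/(1 - R_j^2)$), applies the threshold of 10 used in the text, and then fits the classical model by ordinary least squares. The function names and the synthetic data are assumptions; the numbers are unrelated to Table 2.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of the (n, p) matrix X."""
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(corr))


def fit_ols(X, y):
    """Return [tau_0, tau_1, ..., tau_q] minimising the residual sum of squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # column of ones for the intercept
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Synthetic, noise-free stand-in data (NOT the dengue data).
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]])
y = 1.0 + 0.5 * X[:, 0] - 0.25 * X[:, 1]

factors = vif(X)
print("VIF:", np.round(factors, 2))
if np.all(factors < 10):  # rule of thumb used in the text
    tau = fit_ols(X, y)
    print("coefficients:", np.round(tau, 3))
```

Each fitted slope is read as the change in the response for a one-unit increase in that predictor with the other predictors held fixed.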

Interpretation.
The estimated values in Table 3 represent the regression coefficients, which show that the value of dengue fever increases by 0.043 units for a one-unit increase in age, decreases by 0.104 units for a one-unit increase in suffering fever, decreases by 0.004 units for a one-unit increase in checkup, decreases by 0.045 units for a one-unit increase in BP (U), and decreases by 0.088 units for a one-unit increase in BP (L).
These results are represented in Figure 2, which shows the best-fitted classical multiple regression model. The fuzzy multiple linear regression results are explained in Table 4 and represented by Figure 3.

Interpretation.
The estimated fuzzy regression coefficients show that the value of $Y$ increases by 0.037 units for a one-unit increase in $X_1$, 0.114 units for a one-unit increase in $X_2$, 0.014 units for a one-unit increase in $X_3$, 0.064 units for a one-unit increase in $X_4$, and 0.065 units for a one-unit increase in $X_5$.

Case 1. Fuzzy multiple linear regression model with fuzzy independent variables and crisp dependent variable.
The multiple linear regression results are explained in Table 5 and represented by Figure 4.

Interpretation.
The estimated fuzzy regression coefficients indicate that the value of $Y$ increases by 0.200836 units for a one-unit increase in $X_1$, 0.277264 units for a one-unit increase in $X_2$, and 0.024512 units for a one-unit increase in $X_3$, decreases by 0.09249 units for a one-unit increase in $X_4$, and increases by 0.43941 units for a one-unit increase in $X_5$.

Performance Comparison
The following section describes the performance of multiple linear regression and fuzzy multiple linear regression. It is important to know which of the two methods, classical multiple linear regression or fuzzy multiple linear regression, performs best and gives significant results. The methods are compared using different evaluation criteria, and the results are given in Table 6. The empirical analysis shows that the MSE, RMSE, BIC, and RAE of fuzzy multiple linear regression are all smaller than those of the classical multiple linear regression method. This shows that fuzzy multiple regression can smooth the defuzzified forecast and is more consistent.
It is important to know which of the two methods, classical multiple linear regression or fuzzy multiple linear regression, performs best and gives significant results. The methods are compared using different evaluation criteria, and the results are given in Table 7. The empirical analysis shows that the MSE, RMSE, BIC, and RAE of fuzzy multiple linear regression are all smaller than those of the classical multiple linear regression method. This shows that fuzzy multiple regression can smooth the defuzzified forecast and is more consistent.
We have compared the fuzzy regression results for the triangular and trapezoidal membership functions in Figure 4, which shows that the triangular membership function yields lower values than the trapezoidal one. The triangular membership function is therefore more efficient than the trapezoidal membership function and can be used in further studies for comparison.
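The kind of comparison reported above can be sketched as follows. The block defuzzifies triangular predictions by their centroid and then computes MSE, RMSE, and BIC. Three points are assumptions rather than statements from this paper: centroid defuzzification, the Gaussian-likelihood form $\mathrm{BIC} = n \ln(\mathrm{SSE}/n) + k \ln n$, and all of the numbers used. RAE is left out because its formula is not given in the text.

```python
import math


def defuzzify_triangular(a, b, c):
    """Centroid of the triangular fuzzy number (a, b, c)."""
    return (a + b + c) / 3.0


def mse(y_true, y_pred):
    """Mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mse(y_true, y_pred))


def bic(y_true, y_pred, n_params):
    """BIC under Gaussian errors, up to an additive constant (requires MSE > 0)."""
    n = len(y_true)
    return n * math.log(mse(y_true, y_pred)) + n_params * math.log(n)


# Hypothetical observations and triangular fuzzy predictions (left, middle, right).
y_obs = [1.0, 2.0, 3.0, 4.0]
fuzzy_pred = [(0.5, 1.0, 1.8), (1.6, 2.0, 2.7), (2.4, 3.0, 3.3), (3.5, 4.0, 4.8)]
crisp_pred = [defuzzify_triangular(*t) for t in fuzzy_pred]

print("MSE :", mse(y_obs, crisp_pred))
print("RMSE:", rmse(y_obs, crisp_pred))
print("BIC :", bic(y_obs, crisp_pred, n_params=2))
```

Running the same three functions on the predictions of the classical model and of each fuzzy model gives directly comparable numbers; the model with the smaller values is preferred under these criteria.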

Conclusion
There are numerous classical methods for handling precise information, but in many circumstances precise quantities cannot be obtained. This paper is based on the basic ideas of the multiple linear regression method and the fuzzy multiple regression method. In the proposed work, fuzzy multiple regression is carried out using triangular and trapezoidal membership functions and applied to dengue fever data in order to compare the efficiency of the proposed fuzzy multiple regression with the existing multiple regression model. The mean square error (MSE), Bayesian information criterion (BIC), root absolute error (RAE), and root mean square error (RMSE) of fuzzy multiple regression with triangular and trapezoidal membership functions are smaller than those of multiple linear regression, which indicates that the proposed method performs better than multiple regression [13][14][15][16][17].

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.