Journal of Analytical Methods in Chemistry (Hindawi), Volume 2020, Article ID 1406028
ISSN: 2090-8873, 2090-8865
DOI: 10.1155/2020/1406028

Research Article

Multivariate Analysis under Indeterminacy: An Application to Chemical Content Data

Muhammad Aslam (1) (https://orcid.org/0000-0003-0644-1950) and Osama H. Arif (1, 2)

(1) Department of Statistics, Faculty of Science, King Abdulaziz University, Jeddah 21551, Saudi Arabia (kau.edu.sa)
(2) Department of Mathematics, Faculty of Science, Jouf University, Sakakah, Saudi Arabia (ju.edu.sa)

Academic Editor: Paolo Montuori

Received 30 December 2019; Revised 26 March 2020; Accepted 17 June 2020; Published 11 July 2020

Copyright © 2020 Muhammad Aslam and Osama H. Arif. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The Hotelling T-squared statistic is widely used to test differences in means for multivariate data. The existing statistic under classical statistics applies only when the observations in the multivariate data are determined, precise, and exact. In practice, not all observations are determined and precise, because measurements are often taken in complex situations and under uncertainty. In this paper, we introduce the Hotelling T-squared statistic under neutrosophic statistics (NS), which generalizes classical statistics and applies under uncertainty. We discuss the application and advantage of the neutrosophic Hotelling T-squared statistic with the aid of data. From the comparison, we conclude that the proposed statistic is more adequate and effective under uncertainty.

1. Introduction

In classical statistics (CS), univariate analysis is the technique used to analyze single-variable data, whereas multivariate analysis is widely used to analyze data with more than one variable. Among the multivariate techniques under CS, the Hotelling T-squared statistic has been applied in a variety of fields (see, for example, [1, 2]) to test whether the means of more than one population are equal. This statistic is an extension of the t-test, which tests the mean of a single population. Brereton used the Hotelling T-squared statistic to detect outliers in chemical data. Varmuza and Filzmoser worked on multivariate analysis for chemometric data. Hervé et al. applied the multivariate technique to biological data. Kitagaki et al. used the Hotelling T-squared statistic in chemical and electrochemical oscillator problems. For more details about the applications of the Hotelling T-squared statistic, the reader may refer to [3, 7].

The Hotelling T-squared statistic derived under CS can be applied only when all observations in the multivariate data are determined, precise, and certain. In practice, the data under study are not always precise and may instead be linguistic. For example, the temperature of a city may be described as high, low, or medium, and the measurement of a variable in a complex system may yield an interval rather than a determined value. In such situations, the Hotelling T-squared statistic under CS cannot be used to analyze the data. When observations are uncertain or fuzzy, the fuzzy Hotelling T-squared statistic can be applied to test the means of multivariate populations. Taleb et al. applied the fuzzy Hotelling T-squared statistic to design a control chart. D’Urso provided a review of fuzzy multivariate analysis. Bakdi and Kouadri presented a new adaptive principal component analysis technique to detect faults in a complex system. Ammiche et al. introduced principal component analysis for the Tennessee Eastman process using a fuzzy approach. Further applications are reported in the literature.

Recently, neutrosophic logic, which is an extension of fuzzy logic, has attracted many researchers because of its applications in a variety of fields. Neutrosophic logic considers a measure of indeterminacy, which fuzzy logic does not. Neutrosophic statistics (NS), which is based on neutrosophic numbers, is a generalization of CS (see [17, 18]). NS has been applied widely to rock-measuring problems (see, for example, [19, 20]). Applications of NS to product inspection can be seen in [21, 22], and applications in the area of process control can be seen in [23, 24]. NS has also been applied in the medical field. For more information on neutrosophic theory, the reader may refer to [26, 27].

Aslam and Smarandache [17, 18] offered suggestions for extending several concepts of CS to NS. To the best of our knowledge, and after exploring the literature, there is no work on the development of the Hotelling T-squared statistic under NS. In this paper, we introduce the Hotelling T-squared statistic under NS, which generalizes classical statistics and applies under uncertainty. We discuss the application and advantage of the neutrosophic Hotelling T-squared statistic with the aid of data. We expect that the proposed neutrosophic Hotelling T-squared statistic will perform better than the existing Hotelling T-squared statistic under uncertainty.

2. Preliminaries

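Let $x_{jkN} \in [x_{jkL}, x_{jkU}]$ be a neutrosophic random variable that represents the neutrosophic observation of the $k$th variable recorded on the $j$th item. Here $x_{jkN}$ is expressed as an indeterminacy interval with smaller value $x_{jkL}$ and larger value $x_{jkU}$. The neutrosophic form of $x_{jkN}$, with determinate part $x_{jkL}$ and indeterminate part $x_{jkU} I_N$; $I_N \in [I_L, I_U]$, can be written as $x_{jkN} = x_{jkL} + x_{jkU} I_N$; $I_N \in [I_L, I_U]$. The neutrosophic random variable reduces to the variable under classical statistics if no indeterminacy is recorded in the data. The neutrosophic data matrix with $n_N \in [n_L, n_U]$ neutrosophic observations on $p_N \in [p_L, p_U]$ neutrosophic variables is given as follows:

\[
X_N \in \left[
\begin{pmatrix}
x_{11L} & \cdots & x_{1kL} & \cdots & x_{1pL} \\
\vdots & & \vdots & & \vdots \\
x_{j1L} & \cdots & x_{jkL} & \cdots & x_{jpL} \\
\vdots & & \vdots & & \vdots \\
x_{n1L} & \cdots & x_{nkL} & \cdots & x_{npL}
\end{pmatrix},
\begin{pmatrix}
x_{11U} & \cdots & x_{1kU} & \cdots & x_{1pU} \\
\vdots & & \vdots & & \vdots \\
x_{j1U} & \cdots & x_{jkU} & \cdots & x_{jpU} \\
\vdots & & \vdots & & \vdots \\
x_{n1U} & \cdots & x_{nkU} & \cdots & x_{npU}
\end{pmatrix}
\right]; \quad X_N \in [X_L, X_U]. \tag{1}
\]

The neutrosophic form of $X_N \in [X_L, X_U]$ can be written as

\[
X_N = X_L + X_U I_N; \quad I_N \in [I_L, I_U]. \tag{2}
\]

Note here that $X_N \in [X_L, X_U]$ is the generalization of the data matrix under classical statistics. The neutrosophic data matrix $X_N$ reduces to the data matrix under classical statistics when $I_L = 0$.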
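The interval-valued data matrix in equation (1) can be sketched in code by storing each observation as a (lower, upper) pair and splitting the rows into the matrices X_L and X_U. The helper names `split_intervals` and `is_classical` are hypothetical and chosen only for illustration; the three rows are the first three individuals of Table 1.

```python
# Minimal sketch, assuming each observation is stored as a (lower, upper) pair.
# The helper names are hypothetical and do not come from the paper.

def split_intervals(rows):
    """Split rows of (lower, upper) pairs into the matrices X_L and X_U."""
    for row in rows:
        for lo, hi in row:
            if lo > hi:
                raise ValueError("each interval needs lower <= upper")
    X_L = [[lo for lo, _ in row] for row in rows]
    X_U = [[hi for _, hi in row] for row in rows]
    return X_L, X_U


def is_classical(rows):
    """True when every interval is degenerate, i.e. no indeterminacy was recorded."""
    return all(lo == hi for row in rows for lo, hi in row)


# First three individuals of Table 1: (sweat rate, sodium, potassium).
rows = [
    [(3.7, 3.7), (48.5, 48.7), (9.3, 9.3)],
    [(5.7, 5.8), (65.1, 65.1), (8.0, 8.1)],
    [(3.8, 3.8), (47.2, 47.3), (10.9, 10.9)],
]

X_L, X_U = split_intervals(rows)
print("X_L row 2:", X_L[1])
print("X_U row 2:", X_U[1])
print("classical data?", is_classical(rows))
```

When `is_classical` returns True, X_L and X_U coincide and any statistic computed from them collapses to its classical value, which mirrors the reduction described above.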

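The neutrosophic sample mean and neutrosophic sample variance of $n_N$ measurements on $p_N$ neutrosophic variables are computed as follows. The neutrosophic sample mean of the $k$th variable is

\[
\bar{x}_{kN} \in \left[ \frac{1}{n_L} \sum_{j=1}^{n_L} x_{jkL}, \; \frac{1}{n_U} \sum_{j=1}^{n_U} x_{jkU} \right]; \quad \bar{x}_{kN} \in [\bar{x}_{kL}, \bar{x}_{kU}]. \tag{3}
\]

The neutrosophic form of $\bar{x}_{kN} \in [\bar{x}_{kL}, \bar{x}_{kU}]$ can be written as

\[
\bar{x}_{kN} = \bar{x}_{kL} + \bar{x}_{kU} I_N; \quad I_N \in [I_L, I_U]. \tag{4}
\]

Note here that $\bar{x}_{kN} \in [\bar{x}_{kL}, \bar{x}_{kU}]$ is the generalization of the sample mean under classical statistics. The neutrosophic sample mean reduces to the sample mean under classical statistics when $I_L = 0$. The neutrosophic sample variance of the $k$th variable is

\[
s_{kN}^2 \in \left[ \frac{1}{n_L} \sum_{j=1}^{n_L} \left( x_{jkL} - \bar{x}_{kL} \right)^2, \; \frac{1}{n_U} \sum_{j=1}^{n_U} \left( x_{jkU} - \bar{x}_{kU} \right)^2 \right]; \quad s_{kN}^2 \in [s_{kL}^2, s_{kU}^2]. \tag{5}
\]

The neutrosophic form of $s_{kN}^2 \in [s_{kL}^2, s_{kU}^2]$ can be written as

\[
s_{kN}^2 = s_{kL}^2 + s_{kU}^2 I_N; \quad I_N \in [I_L, I_U]. \tag{6}
\]

Note here that $s_{kN}^2 \in [s_{kL}^2, s_{kU}^2]$ is the generalization of the sample variance under classical statistics. The neutrosophic sample variance reduces to the sample variance under classical statistics when $I_L = 0$.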
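As a minimal sketch of equations (3) and (5), the following code computes the interval-valued mean and variance of one variable from its lower and upper values. The function names are hypothetical; the divisor n follows equation (5) as written, and the sample values are the sweat-rate intervals of the first four individuals in Table 1.

```python
# Minimal sketch of equations (3) and (5); function names are hypothetical.

def neutrosophic_mean(lower, upper):
    """Return (mean of lower values, mean of upper values)."""
    return sum(lower) / len(lower), sum(upper) / len(upper)


def neutrosophic_variance(lower, upper):
    """Return (variance of lower values, variance of upper values), divisor n."""
    def variance(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / len(values)
    return variance(lower), variance(upper)


# Sweat-rate intervals of individuals 1 to 4 in Table 1.
x1_lower = [3.7, 5.7, 3.8, 3.2]
x1_upper = [3.7, 5.8, 3.8, 3.3]

mean_L, mean_U = neutrosophic_mean(x1_lower, x1_upper)
var_L, var_U = neutrosophic_variance(x1_lower, x1_upper)
print(f"mean in [{mean_L:.3f}, {mean_U:.3f}]")
print(f"variance in [{var_L:.4f}, {var_U:.4f}]")
```

Note that the value computed from the lower matrix need not be the smaller of the two; for a variance, either endpoint can be the larger one, so the pair is best read as (value from X_L, value from X_U).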

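The neutrosophic sample covariance between the $i$th and $k$th neutrosophic variables is given by

\[
S_{ikN} \in \left[ \frac{1}{n_L} \sum_{j=1}^{n_L} \left( x_{jiL} - \bar{x}_{iL} \right) \left( x_{jkL} - \bar{x}_{kL} \right), \; \frac{1}{n_U} \sum_{j=1}^{n_U} \left( x_{jiU} - \bar{x}_{iU} \right) \left( x_{jkU} - \bar{x}_{kU} \right) \right]; \quad S_{ikN} \in [S_{ikL}, S_{ikU}]. \tag{7}
\]

The neutrosophic form of $S_{ikN} \in [S_{ikL}, S_{ikU}]$ can be written as

\[
S_{ikN} = S_{ikL} + S_{ikU} I_N; \quad I_N \in [I_L, I_U]. \tag{8}
\]

Note here that $S_{ikN} \in [S_{ikL}, S_{ikU}]$ is the generalization of the sample covariance under classical statistics. The neutrosophic sample covariance reduces to the sample covariance under classical statistics when there are no indeterminate observations.

Finally, the neutrosophic sample correlation between the $i$th and $k$th variables is given by

\[
r_{ikN} \in \left[ \frac{S_{ikL}}{\sqrt{s_{iiL}} \sqrt{s_{kkL}}}, \; \frac{S_{ikU}}{\sqrt{s_{iiU}} \sqrt{s_{kkU}}} \right]; \quad r_{ikN} \in [r_{ikL}, r_{ikU}], \quad i, k = 1, 2, 3, \ldots, p_N. \tag{9}
\]

The neutrosophic form of $r_{ikN} \in [r_{ikL}, r_{ikU}]$ can be written as

\[
r_{ikN} = r_{ikL} + r_{ikU} I_N; \quad I_N \in [I_L, I_U]. \tag{10}
\]

Note here that $r_{ikN} \in [r_{ikL}, r_{ikU}]$ is the generalization of the sample correlation under classical statistics. The neutrosophic sample correlation reduces to the sample correlation under classical statistics when there are no indeterminate observations.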
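The covariance and correlation in equations (7) and (9) can be sketched in the same way, by evaluating the classical formulas once on the lower values and once on the upper values. The function names are hypothetical, the divisor n follows equation (7), and the data are the sweat-rate and potassium intervals of the first five individuals in Table 1.

```python
import math

# Minimal sketch of equations (7) and (9); function names are hypothetical.

def covariance(xs, ys):
    """Covariance of two equal-length lists, divisor n as in equation (7)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance scaled by the two standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


def neutrosophic_correlation(xi_L, xk_L, xi_U, xk_U):
    """Return (correlation from lower values, correlation from upper values)."""
    return correlation(xi_L, xk_L), correlation(xi_U, xk_U)


# Individuals 1 to 5 of Table 1: sweat rate (X1) and potassium (X3).
x1_L = [3.7, 5.7, 3.8, 3.2, 3.1]
x1_U = [3.7, 5.8, 3.8, 3.3, 3.1]
x3_L = [9.3, 8.0, 10.9, 12.0, 9.7]
x3_U = [9.3, 8.1, 10.9, 12.0, 9.8]

r_L, r_U = neutrosophic_correlation(x1_L, x3_L, x1_U, x3_U)
print(f"r_13 from lower values: {r_L:.4f}")
print(f"r_13 from upper values: {r_U:.4f}")
```

For this small subset both correlations come out negative, in line with the negative sweat-rate and potassium covariance reported for the full data set in Section 4.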

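The neutrosophic descriptive statistics for $n_N$ measurements on $p_N$ variables can be arranged into the following arrays.

The neutrosophic sample means, the variances and covariances, and the correlations are presented by the arrays

\[
\bar{X}_N \in \left[
\begin{pmatrix} \bar{x}_{1L} \\ \vdots \\ \bar{x}_{pL} \end{pmatrix},
\begin{pmatrix} \bar{x}_{1U} \\ \vdots \\ \bar{x}_{pU} \end{pmatrix}
\right],
\]

\[
S_N \in \left[
\begin{pmatrix}
s_{11L} & \cdots & s_{1kL} & \cdots & s_{1pL} \\
\vdots & & \vdots & & \vdots \\
s_{j1L} & \cdots & s_{jkL} & \cdots & s_{jpL} \\
\vdots & & \vdots & & \vdots \\
s_{p1L} & \cdots & s_{pkL} & \cdots & s_{ppL}
\end{pmatrix},
\begin{pmatrix}
s_{11U} & \cdots & s_{1kU} & \cdots & s_{1pU} \\
\vdots & & \vdots & & \vdots \\
s_{j1U} & \cdots & s_{jkU} & \cdots & s_{jpU} \\
\vdots & & \vdots & & \vdots \\
s_{p1U} & \cdots & s_{pkU} & \cdots & s_{ppU}
\end{pmatrix}
\right],
\]

and

\[
R_N \in \left[
\begin{pmatrix}
1 & r_{12L} & \cdots & r_{1pL} \\
r_{21L} & 1 & \cdots & r_{2pL} \\
\vdots & \vdots & \ddots & \vdots \\
r_{p1L} & r_{p2L} & \cdots & 1
\end{pmatrix},
\begin{pmatrix}
1 & r_{12U} & \cdots & r_{1pU} \\
r_{21U} & 1 & \cdots & r_{2pU} \\
\vdots & \vdots & \ddots & \vdots \\
r_{p1U} & r_{p2U} & \cdots & 1
\end{pmatrix}
\right],
\]

respectively.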

3. Neutrosophic Hotelling $T_N^2$ Statistic

In this section, we discuss the proposed neutrosophic Hotelling $T_N^2$ statistic. In classical statistics, Student's t-test is applied to test the mean in the univariate case. Rejecting the null hypothesis about the mean when $|t_N|$ is large is equivalent to rejecting it when the square $t_N^2$ is large:

\[
t_N^2 = \left[ \frac{\left( \bar{X}_L - \mu_{0L} \right)^2}{s_L^2 / n_L}, \; \frac{\left( \bar{X}_U - \mu_{0U} \right)^2}{s_U^2 / n_U} \right]
= \left[ n_L \left( \bar{X}_L - \mu_{0L} \right) \left( s_L^2 \right)^{-1} \left( \bar{X}_L - \mu_{0L} \right), \; n_U \left( \bar{X}_U - \mu_{0U} \right) \left( s_U^2 \right)^{-1} \left( \bar{X}_U - \mu_{0U} \right) \right]; \quad t_N^2 \in [t_L^2, t_U^2]. \tag{11}
\]

The neutrosophic form of $t_N^2 \in [t_L^2, t_U^2]$ can be written as

\[
t_N^2 = t_L^2 + t_U^2 I_N; \quad I_N \in [I_L, I_U]. \tag{12}
\]

Note here that $t_N^2 \in [t_L^2, t_U^2]$ is the generalization of the squared t statistic under classical statistics. The neutrosophic statistic $t_N^2$ reduces to the statistic under classical statistics when there are no indeterminate observations.

For given values of $\bar{x}_{kN} \in [\bar{x}_{kL}, \bar{x}_{kU}]$ and $s_{kN}^2 \in [s_{kL}^2, s_{kU}^2]$, the null hypothesis will be rejected if

\[
t_N^2 \in [t_L^2, t_U^2] > t_{n_U - 1}^2 \left( \alpha / 2 \right), \tag{13}
\]

where $\alpha$ is the level of significance and $t_{n_U - 1}(\alpha/2)$ is the upper $100(\alpha/2)$th percentile of the neutrosophic t-distribution with $n_U - 1$ neutrosophic degrees of freedom.

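The generalization of equations (11) and (12) to the multivariate case under the neutrosophic statistical interval method (NSIM) is given by

\[
T_N^2 = \left[ \left( \bar{X}_L - \mu_{0L} \right)' \left( \frac{1}{n_L} S_L \right)^{-1} \left( \bar{X}_L - \mu_{0L} \right), \; \left( \bar{X}_U - \mu_{0U} \right)' \left( \frac{1}{n_U} S_U \right)^{-1} \left( \bar{X}_U - \mu_{0U} \right) \right]; \quad T_N^2 \in [T_L^2, T_U^2], \tag{14}
\]

where

\[
\underset{(p_N \times 1)}{\bar{X}_N} = \frac{1}{n_N} \sum_{j=1}^{n_N} X_{jN}, \quad
\underset{(p_N \times p_N)}{S_N} = \frac{1}{n_N - 1} \sum_{j=1}^{n_N} \left( X_{jN} - \bar{X}_N \right) \left( X_{jN} - \bar{X}_N \right)', \quad
\underset{(p_N \times 1)}{\mu_{0N}} = \left( \mu_{1N}, \ldots, \mu_{pN} \right)',
\]

with $\bar{X}_N \in [\bar{X}_L, \bar{X}_U]$, $S_N \in [S_L, S_U]$, and $\mu_{0N} \in [\mu_{0L}, \mu_{0U}]$.

The neutrosophic form of $T_N^2 \in [T_L^2, T_U^2]$ can be written as

\[
T_N^2 = T_L^2 + T_U^2 I_N; \quad I_N \in [I_L, I_U]. \tag{15}
\]

The statistic given in equation (14) is called the neutrosophic Hotelling $T_N^2$ statistic. It follows a scaled neutrosophic F-distribution with neutrosophic degrees of freedom (ndf) $p_N$ and $n_N - p_N$:

\[
T_N^2 \sim \frac{(n_N - 1) p_N}{n_N - p_N} F_{p_N, \, n_N - p_N}; \quad T_N^2 \in [T_L^2, T_U^2]. \tag{16}
\]

The neutrosophic Hotelling $T_N^2$ statistic can be used to test the null hypothesis $H_{0N}: \mu_N = \mu_{0N}$ against the alternative hypothesis $H_{1N}: \mu_N \neq \mu_{0N}$. The null hypothesis $H_{0N}: \mu_N = \mu_{0N}$ will be rejected if

\[
T_N^2 \in [T_L^2, T_U^2] > \frac{(n_N - 1) p_N}{n_N - p_N} F_{p_N, \, n_N - p_N}(\alpha). \tag{17}
\]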
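The quadratic form in equation (14) and the rejection rule in equation (17) can be sketched with the standard library alone. Everything named here (`solve`, `hotelling_t2`, `critical_value`, `decide`) is a hypothetical helper, the F quantile is passed in by the caller rather than computed, and the "indeterminate" outcome for a critical value that falls inside the interval is an assumption of this sketch, not a rule stated in the text.

```python
# Minimal sketch of equations (14) and (17); all helper names are hypothetical.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[pivot][c]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def hotelling_t2(xbar, mu0, S, n):
    """n * (xbar - mu0)' S^{-1} (xbar - mu0), evaluated for one endpoint."""
    d = [a - b for a, b in zip(xbar, mu0)]
    z = solve(S, d)  # z = S^{-1} d without forming the inverse explicitly
    return n * sum(di * zi for di, zi in zip(d, z))


def critical_value(n, p, f_quantile):
    """(n - 1) p / (n - p) times a caller-supplied upper F quantile."""
    return (n - 1) * p / (n - p) * f_quantile


def decide(t2_lower, t2_upper, crit):
    """Assumed three-way rule: reject only if both endpoints exceed crit."""
    if min(t2_lower, t2_upper) > crit:
        return "reject"
    if max(t2_lower, t2_upper) <= crit:
        return "do not reject"
    return "indeterminate"


# Toy check with a diagonal covariance matrix: 5 * (2^2/2 + 2^2/4) = 15.
print(hotelling_t2([2.0, 2.0], [0.0, 0.0], [[2.0, 0.0], [0.0, 4.0]], 5))
```

Applying `hotelling_t2` once with the lower mean vector and covariance matrix and once with the upper ones gives the two endpoints of the interval in equation (14).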

Statistical software reports a p value that can be used to decide whether to accept or reject the null hypothesis. In the neutrosophic setting, “a neutrosophic p value is defined in the same way as in classical statistics: the smallest level of significance at which a null hypothesis H0 can be rejected.” Note here that the neutrosophic p value is not an exact or determined value, as it is in classical statistics. Smarandache discussed criteria for accepting or rejecting the null hypothesis using the neutrosophic p value.

4. Application

Now, we discuss the application of the proposed neutrosophic Hotelling $T_N^2$ statistic using data from the healthcare field. The data were collected from 20 healthy women, and three variables were measured: sweat rate, sodium content, and potassium content. The observations of the variables under investigation are obtained from a measurement process, so not all observations are expected to be precise and exact, and the data therefore cannot be analyzed using CS. Similar data for classical statistics are available in the literature. The data, which include some neutrosophic observations, are shown in Table 1. We want to test whether the population means of the three variables for healthy women equal the hypothesized values. The test proceeds as follows:

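Step 1: state the null and alternative hypotheses,

\[
H_{0N}: \mu_N = \mu_{0N} = \begin{pmatrix} [4, 4] \\ [50, 50] \\ [10, 10] \end{pmatrix}
\quad \text{vs} \quad
H_{1N}: \mu_N \neq \mu_{0N}.
\]

Step 2: some basic calculations for the data given in Table 1 are shown as

\[
\bar{X}_N = \begin{pmatrix} [4.64, 4.70] \\ [45.4, 45.47] \\ [9.965, 10.01] \end{pmatrix}, \quad
\mu_{0N} = \begin{pmatrix} [4, 4] \\ [50, 50] \\ [10, 10] \end{pmatrix}, \quad
D_N = \bar{X}_N - \mu_{0N} = \begin{pmatrix} [0.64, 0.70] \\ [-4.6, -4.53] \\ [-0.035, 0.01] \end{pmatrix},
\]

\[
S_N = \begin{pmatrix}
[2.8793, 2.8389] & [10.01, 10.04] & [-1.8090, -1.8284] \\
[10.01, 10.04] & [199.7884, 200.71] & [-5.64, -5.72] \\
[-1.8090, -1.8284] & [-5.64, -5.72] & [3.6276, 3.5830]
\end{pmatrix}. \tag{18}
\]

Step 3: let $\alpha = [0.10, 0.10]$ be the level of significance.

Step 4: the neutrosophic Hotelling $T_N^2$ statistic is

\[
T_N^2 = \left[ n_L D_L' S_L^{-1} D_L, \; n_U D_U' S_U^{-1} D_U \right] = 20 \, D_N' S_N^{-1} D_N = [9.7387, 11.4176], \tag{19}
\]

where $D_N$ and $S_N$ are the arrays computed in Step 2.

Step 5: the critical value from equation (17) is

\[
T_{0N}^2 = \frac{(20 - 1) \, 3}{20 - 3} F_{0.10}(3, 17) = [8.17, 8.17]. \tag{20}
\]

Step 6: as $T_N^2 = [9.7387, 11.4176] > T_{0N}^2 = [8.17, 8.17]$, we reject $H_{0N}: \mu_N = \mu_{0N}$.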

Table 1: The neutrosophic sweat data.

Individual   X1 (sweat rate)   X2 (sodium)     X3 (potassium)
1            [3.7, 3.7]        [48.5, 48.7]    [9.3, 9.3]
2            [5.7, 5.8]        [65.1, 65.1]    [8.0, 8.1]
3            [3.8, 3.8]        [47.2, 47.3]    [10.9, 10.9]
4            [3.2, 3.3]        [53.2, 53.3]    [12.0, 12.0]
5            [3.1, 3.1]        [55.5, 55.5]    [9.7, 9.8]
6            [4.6, 4.8]        [36.1, 36.2]    [7.9, 7.9]
7            [2.4, 2.4]        [24.8, 24.8]    [14.0, 14.0]
8            [7.2, 7.3]        [33.1, 33.2]    [7.6, 7.7]
9            [6.7, 6.7]        [47.4, 47.4]    [8.5, 8.6]
10           [5.4, 5.5]        [54.1, 54.2]    [11.3, 11.3]
11           [3.9, 3.9]        [36.9, 36.9]    [12.7, 12.7]
12           [4.5, 4.6]        [58.8, 58.9]    [12.3, 12.4]
13           [3.5, 3.5]        [27.8, 27.9]    [9.8, 9.8]
14           [4.5, 4.5]        [40.2, 40.2]    [8.4, 8.5]
15           [1.5, 1.7]        [13.5, 13.5]    [10.1, 10.2]
16           [8.5, 8.5]        [56.4, 56.5]    [7.1, 7.1]
17           [4.5, 4.7]        [71.6, 71.9]    [8.2, 8.2]
18           [6.5, 6.5]        [52.8, 52.8]    [10.9, 10.9]
19           [4.1, 4.2]        [44.1, 44.2]    [11.2, 11.3]
20           [5.5, 5.5]        [40.9, 40.9]    [9.4, 9.5]

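Steps 2 to 6 can be retraced from Table 1 with a short script. This is a sketch under stated assumptions rather than the authors' implementation: the helper names are hypothetical, the covariance uses divisor n - 1 as in the definition of S_N under equation (14), and the F quantile 2.44 is an assumed tabulated value for F(3, 17) at the 0.10 level. The two endpoints should land close to the interval reported in Step 4.

```python
# Sketch reproducing Steps 2 to 6 from Table 1 (standard library only).
# Each row: (x1_L, x1_U, x2_L, x2_U, x3_L, x3_U).
DATA = [
    (3.7, 3.7, 48.5, 48.7, 9.3, 9.3), (5.7, 5.8, 65.1, 65.1, 8.0, 8.1),
    (3.8, 3.8, 47.2, 47.3, 10.9, 10.9), (3.2, 3.3, 53.2, 53.3, 12.0, 12.0),
    (3.1, 3.1, 55.5, 55.5, 9.7, 9.8), (4.6, 4.8, 36.1, 36.2, 7.9, 7.9),
    (2.4, 2.4, 24.8, 24.8, 14.0, 14.0), (7.2, 7.3, 33.1, 33.2, 7.6, 7.7),
    (6.7, 6.7, 47.4, 47.4, 8.5, 8.6), (5.4, 5.5, 54.1, 54.2, 11.3, 11.3),
    (3.9, 3.9, 36.9, 36.9, 12.7, 12.7), (4.5, 4.6, 58.8, 58.9, 12.3, 12.4),
    (3.5, 3.5, 27.8, 27.9, 9.8, 9.8), (4.5, 4.5, 40.2, 40.2, 8.4, 8.5),
    (1.5, 1.7, 13.5, 13.5, 10.1, 10.2), (8.5, 8.5, 56.4, 56.5, 7.1, 7.1),
    (4.5, 4.7, 71.6, 71.9, 8.2, 8.2), (6.5, 6.5, 52.8, 52.8, 10.9, 10.9),
    (4.1, 4.2, 44.1, 44.2, 11.2, 11.3), (5.5, 5.5, 40.9, 40.9, 9.4, 9.5),
]
MU0 = [4.0, 50.0, 10.0]   # hypothesized means from Step 1
F_QUANTILE = 2.44         # assumed tabulated upper 10% point of F(3, 17)


def column_means(X):
    n = len(X)
    return [sum(row[k] for row in X) / n for k in range(len(X[0]))]


def covariance_matrix(X):
    """Sample covariance matrix with divisor n - 1."""
    n, p = len(X), len(X[0])
    m = column_means(X)
    return [[sum((r[i] - m[i]) * (r[k] - m[k]) for r in X) / (n - 1)
             for k in range(p)] for i in range(p)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def hotelling_t2(X, mu0):
    """n * D' S^{-1} D for one endpoint data matrix."""
    d = [m - m0 for m, m0 in zip(column_means(X), mu0)]
    z = solve(covariance_matrix(X), d)
    return len(X) * sum(di * zi for di, zi in zip(d, z))


X_L = [[r[0], r[2], r[4]] for r in DATA]   # lower data matrix
X_U = [[r[1], r[3], r[5]] for r in DATA]   # upper data matrix
t2_L, t2_U = hotelling_t2(X_L, MU0), hotelling_t2(X_U, MU0)

n, p = len(DATA), len(MU0)
crit = (n - 1) * p / (n - p) * F_QUANTILE

print(f"T2_N in [{t2_L:.4f}, {t2_U:.4f}], critical value {crit:.2f}")
print("reject H0" if min(t2_L, t2_U) > crit else "do not reject H0 at both endpoints")
```

Solving the linear system S z = D instead of inverting S is a design choice of the sketch; for a 3 by 3 matrix either route works, but the solve avoids writing a separate inversion routine.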
5. Comparisons

In Section 4, we presented the testing procedure for the proposed neutrosophic Hotelling $T_N^2$ statistic, which is the generalization of the statistic under CS. The proposed testing procedure reduces to the testing procedure under CS when all observations in the sweat data are precise. For the neutrosophic sweat data, the proposed procedure yields values in an indeterminacy interval rather than determined values. The neutrosophic form of the proposed Hotelling statistic is $T_N^2 = 9.7387 + 11.4176 I_N$; $I_N \in [0, 0.1470]$. That is, the proposed statistic has the indeterminacy interval from 9.7387 to 11.4176, so under uncertainty one can expect values of $T_N^2$ between 9.7387 and 11.4176. The first value, 9.7387, is the determinate part, and $11.4176 I_N$ is the indeterminate part. When no imprecise observations are recorded in the sweat data, the value of $T_N^2$ is 9.7387, which is the value under CS. In other words, when the level of significance is 5%, the probabilities that the null hypothesis is accepted, rejected, and indeterminate are 0.95, 0.05, and 0.1470, respectively. Comparing the proposed test with the test under CS, we note that the existing test cannot provide the probability of indeterminacy. As mentioned in [19, 20], a method that provides values in an indeterminacy interval under uncertainty is considered the most effective and adequate method. The comparison of the proposed testing procedure with the existing procedure under CS therefore agrees with the findings in [19, 20].
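The upper bound 0.1470 of the indeterminacy interval is consistent with taking the width of the interval relative to its upper endpoint. That rule is an assumption inferred from the reported numbers, not a formula stated in the text, and the function name `neutrosophic_form` is hypothetical.

```python
# Sketch: neutrosophic form a + b*I with I in [0, I_U] for an interval [a, b].
# Assumption: I_U = (b - a) / b, which reproduces the reported value 0.1470.

def neutrosophic_form(lower, upper):
    """Return (determinate part, coefficient of I, upper bound of I)."""
    if upper <= 0 or lower > upper:
        raise ValueError("expected 0 < upper and lower <= upper")
    return lower, upper, (upper - lower) / upper


a, b, i_upper = neutrosophic_form(9.7387, 11.4176)
print(f"T2_N = {a} + {b} I_N; I_N in [0, {i_upper:.4f}]")

# At I_N = 0 the form gives the classical value; at I_N = I_U it gives the upper endpoint.
print(a + b * 0.0, a + b * i_upper)
```

With this convention, setting the indeterminacy to zero recovers the determinate value 9.7387, and setting it to its upper bound recovers 11.4176.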

6. Concluding Remarks

In this paper, we introduced the Hotelling T-squared statistic under neutrosophic statistics (NS), which generalizes classical statistics and applies under uncertainty. We discussed the application and advantage of the neutrosophic Hotelling T-squared statistic with the aid of data. The proposed neutrosophic Hotelling T-squared statistic is expressed as an indeterminacy interval and is hence more flexible and informative than the Hotelling T-squared statistic under classical statistics. Based on the comparison, we recommend using the proposed neutrosophic Hotelling T-squared statistic to analyze data under uncertainty. Further properties of the proposed statistic, as well as its sensitivity to uncertainty and measurement errors, can be studied in future work.

Data Availability

The data used to support the findings of this study are included in the paper.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding this paper.

Acknowledgments

This article was funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah. The authors, therefore, acknowledge DSR technical and financial support with thanks.