Application of Multivariate Statistical Analysis in Evaluation of Surface River Water Quality of a Tropical River

1 Department of Chemistry, Faculty of Resource Science and Technology, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia
2 Department of Aquatic Science, Faculty of Resource Science and Technology, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia
3 Institute of Biodiversity and Environmental Conservation, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia


Introduction
The Batang Baram ("batang" denotes a big river; coordinates: 4°35′5.28″N and 113°58′44.256″E) is located in the northern part of Sarawak, where it flows 400 km westwards, mostly through primary and secondary forest, to the South China Sea. The river is the second longest river in Sarawak and the third longest river in Malaysia. The Baram area was once pristine, but it has undergone profound changes associated with population growth and development. An increasing residential area, numerous longhouses, and swidden agriculture are found along the river. Commercial logging has also been carried out actively in the area for decades, and the logged forest has subsequently been converted to commercial oil palm and acacia plantations [1].
Although development continues to grow in this area, studies on the water quality of the river are relatively scarce, despite the river serving as an important source of drinking water for the rural community. Discharges of domestic sewage and agricultural runoff can lead to eutrophication [2][3][4], while deforestation can cause sedimentation and nutrient enrichment in the river [5][6][7][8]. In 1995, sampling was conducted in the uppermost catchment of the Baram River basin. The study revealed that the overall water quality of the river was relatively good at that time but was subject to high suspended solids, which came from soil erosion due to land clearing and timber harvesting [9]. The author also pointed out that elevated ammonia was found near domestic and animal waste discharges.
Water quality assessment and monitoring of a large river basin like the Batang Baram can generate a large data set. Numerous studies have shown that multivariate statistical analysis is useful for assessing spatial water quality variations in a river [10][11][12][13][14][15][16]. Cluster analysis can reveal similarities among a large number of sampling stations in a river, while principal component analysis helps identify the important factors accounting for most of the variance in the water quality of a river. Hence, the aim of the present study was to apply multivariate statistical analysis to the interpretation of the physicochemical characteristics of the Batang Baram and its tributaries. The output of the analysis would provide valuable information for decision making in river basin management.

Field Collection.
In situ and ex situ parameters were collected at 30 sampling stations located along the Batang Baram and its tributaries, covering a distance of approximately 172 km (Figure 1). Table 1 shows the details of the samplings, which were carried out in 2015 from upstream to downstream. The water level of the river was high during sampling because of rain before each sampling. The whole study area was subjected to logging activities. The numerous longhouses and plantation activities in the area are included in Table 1. In situ parameters including temperature, pH, conductivity, oxygen saturation (DOsat), dissolved oxygen (DO), and turbidity were measured using a multiparameter water quality sonde (YSI6920 V2-2). Transparency, depth, and flow velocity were measured using a Secchi disc with a measuring tape, a depth sounder (PS-7, Hondex), and a stream flow meter (Geopacks), respectively. Total discharge, mean velocity, and mean depth were calculated according to [17]. Water samples were taken for the analyses of chlorophyll a (chl a), total suspended solids (TSS), five-day biochemical oxygen demand (BOD5), chemical oxygen demand (COD), total phosphorus (TP), total ammonia nitrogen (TAN), nitrite-nitrogen (NO2−-N), nitrate-nitrogen (NO3−-N), organic nitrogen (Org-N), and total sulphide (TS). All sampling bottles were acid-washed, cleaned, and dried before use. Analyses of chl a, TSS, and BOD5 began in the field immediately after sampling, while NO2−-N, NO3−-N, and TS analyses were completed in the field after sampling. Water samples were acidified to pH < 2 for COD, TP, TAN, and Org-N analyses. The samples were placed in an ice box and transported to the laboratory for further analysis [18].

Laboratory Analysis.
All the analyses were conducted according to the standard methods [16,17]. Chl a was determined from samples filtered through a 0.7 µm glass microfibre filter (Whatman GF/F) and extracted for 24 h using 90% (v/v) acetone. For TSS, an adequate volume of sample was filtered through a 1.0 µm glass microfibre filter (Whatman GF/B) in the field, and the filter was dried in the laboratory in an oven at 105 °C until a constant weight was obtained. TSS was then determined as the difference between the initial and final weight of the filter and expressed as milligrams per liter of sample. BOD5 was determined as the difference between the initial and final DO content after a five-day incubation of the sample. The initial DO content was measured in the field. Whenever the in situ DO value was deemed too low, it was raised by vigorous aeration. COD was determined by the closed reflux method followed by the titrimetric method. For TP analysis, persulfate digestion of samples was conducted, followed by the ascorbic acid method. TAN, NO2−-N, and NO3−-N were determined by Nessler's method, the diazotization method (low range), and the cadmium reduction method, respectively. Before the analyses of NO2−-N and NO3−-N, the water sample was filtered through a 0.7 µm glass microfibre filter (Whatman GF/F). Org-N was determined by the Macro-Kjeldahl method, in which ammonia was removed from the water sample before digestion and distillation. Subsequently, ammonia was analyzed by using Nessler's method. TS was analyzed using the methylene blue method. H2S was calculated according to [18] with the following equation:

$$\mathrm{H_2S} = \mathrm{TS} \times \frac{[\mathrm{H^+}]}{K_1' + [\mathrm{H^+}]}$$

where H2S is the unionized hydrogen sulphide, TS is the total sulphide, $K_1'$ is the conditional ionization constant, and [H+] is the hydrogen ion concentration.
A calibration curve was constructed for each chemical analysis. The blank and standard solutions were treated in the same way as the samples.
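The unionized hydrogen sulphide calculation described above can be sketched as follows. This is a minimal illustration, assuming the usual first-dissociation form in which the unionized fraction equals [H+] / (K + [H+]). The function name `unionized_h2s` and the default pK of 7.0 are assumptions made for the example, not values from the text; in practice the conditional ionization constant depends on temperature and ionic strength and would be taken from [18].

```python
def unionized_h2s(total_sulphide_mg_l, ph, pk1=7.0):
    """Estimate unionized H2S (mg/L) from total sulphide and pH.

    pk1 is an ASSUMED conditional ionization constant (as -log10 K);
    replace it with the tabulated value for the sample temperature.
    """
    if total_sulphide_mg_l < 0:
        raise ValueError("total sulphide must be non-negative")
    h_ion = 10.0 ** (-ph)    # hydrogen ion concentration from pH
    k1 = 10.0 ** (-pk1)      # conditional ionization constant
    fraction = h_ion / (k1 + h_ion)
    return total_sulphide_mg_l * fraction


# Hypothetical sample: 1.0 mg/L total sulphide at three pH values
for ph in (6.0, 7.0, 8.0):
    print(ph, round(unionized_h2s(1.0, ph), 3))
```

Under this form, the unionized share falls as pH rises, so the same total sulphide reading corresponds to less H2S in more alkaline water.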

Water Quality Index (WQI).
The water quality index (WQI), which combines the six variables DO, BOD, COD, TSS, AN, and pH, was calculated according to the following equation:

$$\mathrm{WQI} = 0.22\,SI_{DO} + 0.19\,SI_{BOD} + 0.16\,SI_{COD} + 0.15\,SI_{AN} + 0.16\,SI_{SS} + 0.12\,SI_{pH}$$

where SI_DO is the subindex for DO (% saturation), SI_BOD is the subindex for BOD (mg/L), SI_COD is the subindex for COD (mg/L), SI_AN is the subindex for AN (mg/L), SI_SS is the subindex for SS (mg/L), and SI_pH is the subindex for pH [19].
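As an illustration of how the six subindices combine into one index, the sketch below applies the weights of the Malaysian Department of Environment WQI formula and maps the result to a class and a pollution status. The weights, the class boundaries, and the status boundaries are assumed here to follow the commonly published DOE scheme [19]; the function names and the subindex values in the example are hypothetical, and the piecewise subindex equations themselves are not reproduced.

```python
# Assumed DOE Malaysia weights for the six subindices (they sum to 1.0)
WEIGHTS = {"DO": 0.22, "BOD": 0.19, "COD": 0.16,
           "AN": 0.15, "SS": 0.16, "pH": 0.12}


def wqi(subindices):
    """Weighted sum of the six subindices (each on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(subindices)
    if missing:
        raise ValueError(f"missing subindices: {sorted(missing)}")
    return sum(WEIGHTS[name] * subindices[name] for name in WEIGHTS)


def wqi_class(value):
    """Map a WQI value to Class I-V (assumed DOE class boundaries)."""
    if value > 92.7:
        return "I"
    if value >= 76.5:
        return "II"
    if value >= 51.9:
        return "III"
    if value >= 31.0:
        return "IV"
    return "V"


def wqi_status(value):
    """Map a WQI value to a pollution status (assumed DOE boundaries)."""
    if value >= 81:
        return "clean"
    if value >= 60:
        return "slightly polluted"
    return "polluted"


# Hypothetical subindices for one station (not data from this study)
station = {"DO": 95, "BOD": 80, "COD": 50, "AN": 70, "SS": 60, "pH": 98}
score = wqi(station)
print(round(score, 2), wqi_class(score), wqi_status(score))
```

Because COD and SS together carry about a third of the weight, low subindices for those two variables can pull a station into Class III even when DO and pH are good.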

Statistical Analysis.
Comparison of physicochemical parameters between the stations in the Batang Baram was conducted using one-way ANOVA and Tukey's pairwise comparisons at the 5% significance level. The independent samples t-test was used to compare the physicochemical parameters between the main river and tributary stations. Pearson's correlation analysis was performed to determine the relationships among all the parameters. Cluster analysis (CA) was used to investigate the grouping of the sampling stations by using the physicochemical parameters collected in the river. z-score standardization of the variables and Ward's method using Euclidean distances as a measure of similarity were used. A cluster was considered statistically significant at a linkage distance of <60%, and the number of clusters was decided by the practicality of the outputs [12]. Principal component analysis (PCA) was conducted to characterize the loadings of all physicochemical parameters on each of the PCs with an eigenvalue greater than one (Kaiser criterion). A component has a significant loading on a variable when the loading is greater than 0.4 [20]. The data were square-rooted and standardized prior to the analysis. The suitability of the data for PCA was confirmed with the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett's test of sphericity. All the statistical analyses were carried out by using the Statistical Package for the Social Sciences (SPSS Version 22, SPSS Inc., 1995).
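The PCA steps described above (square-root transformation, standardization, retention of components with eigenvalue greater than one, and a 0.4 loading threshold) can be sketched with NumPy as follows. This is an illustrative sketch rather than the SPSS procedure used in the study: it performs no varimax rotation and no KMO or Bartlett test, the function name `pca_kaiser` is hypothetical, and the data are synthetic.

```python
import numpy as np


def pca_kaiser(data, loading_cutoff=0.4):
    """PCA on the correlation matrix, keeping components with eigenvalue > 1.

    data: array of shape (n_stations, n_parameters), non-negative values.
    Returns all eigenvalues (descending), the unrotated loadings of the
    retained components, their explained-variance fractions, and a boolean
    mask of loadings whose absolute value exceeds the cutoff.
    """
    x = np.sqrt(np.asarray(data, dtype=float))           # square-root transform
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)     # z-score standardization
    corr = np.corrcoef(z, rowvar=False)                  # parameter correlations
    eigvals, eigvecs = np.linalg.eigh(corr)              # ascending order
    order = np.argsort(eigvals)[::-1]                    # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                                 # Kaiser criterion
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep] / eigvals.sum()
    significant = np.abs(loadings) > loading_cutoff
    return eigvals, loadings, explained, significant


# Synthetic example: 30 stations, 6 parameters, two of them correlated
rng = np.random.default_rng(0)
example_data = rng.uniform(1.0, 10.0, size=(30, 6))
example_data[:, 1] += 2.0 * example_data[:, 0]

eigvals, loadings, explained, significant = pca_kaiser(example_data)
print("retained components:", loadings.shape[1])
print("explained variance (%):", np.round(100 * explained, 1))
```

The sign of each column of loadings is arbitrary in an eigendecomposition, so only the relative signs of loadings within one component carry meaning.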

The Physicochemical Characteristics of the Batang Baram and Its Tributaries.
Figures 2 and 3 show the mean values of the physicochemical parameters of the Batang Baram and its tributaries from upstream to downstream regions. During the sampling, total discharge of the Batang Baram ranged from 126.3 m³/s to 2711.8 m³/s in the main river and from 0.5 m³/s to 133.6 m³/s in the tributaries. Figure 2 illustrates that total discharge of the main river showed an increasing trend towards downstream regions, whereas the highest total discharge in the tributaries was observed at station 8, followed by station 16. Mean velocity was relatively consistent in the main river, with a mean value of 1.2 m/s. High mean velocity (>1 m/s) was also observed in some of the tributaries located upstream, but most of the tributaries were slow flowing (≈0.2 m/s). Mean depth of the Batang Baram ranged from 0.9 m to 16.5 m in the main river and from 0.2 m to 6.1 m in the tributaries. Both the main river and the tributaries were deeper downstream than upstream. The results of ANOVA showed that all of the parameters varied significantly (p value ≤ 0.05) from one sampling station to another. The physicochemical parameters showed different distribution patterns along the main river. The turbidity, TSS, and H2S values increased significantly (p value ≤ 0.05) towards downstream; the highest values of turbidity (468.8 ± 45.4 NTU) and TSS (320.0 ± 19.3 mg/L) were both observed at station 20, while the highest value of H2S (0.83 ± 0.01 mg/L) was observed at station 22. The high turbidity and TSS downstream indicate the accumulation of sediment in the river. Reference [21] reported that a spit had formed in the Baram River mouth and continued to expand due to erosion associated with deforestation and land use changes in the upstream region. The similar distribution pattern and significant positive correlation (p value ≤ 0.05) between H2S, turbidity, and TSS (Table 2) indicated that H2S was associated with suspended solids in the river.

On the other hand, the conductivity, BOD5, TP, NO3−-N, and Org-N showed higher values in the upper part of the river and decreased significantly (p value ≤ 0.05) towards the downstream region. In contrast, [22] demonstrated that TP, TN, and NH3-N concentrations tend to increase from upstream to downstream regions in the Qiantang River, East China. In the present study, the highest conductivity value was observed at station 1 (50.0 µS/cm), and conductivity steadily decreased to 3.5 µS/cm at station 27. The highest values of BOD5 (5.7 ± 0.2 mg/L) and Org-N (2.74 ± 0.01 mg/L) were observed at stations 1 and 4, respectively, while the highest values of TP (2.2 ± 0.1 mg/L) and NO3−-N (0.07 ± 0.01 mg/L) were observed at station 6. The conductivity (82 µS/cm to 133 µS/cm) in the uppermost part of the Baram River basin reported by [9] was higher than in the present study, which agrees with the present result that conductivity was higher in the upper part of the river. However, the author also reported concentrations of BOD5 (0.7 mg/L to 2.0 mg/L) and NO3−-N (0.01 mg/L to 0.02 mg/L) that were lower than in the present study.
Significantly higher COD values (p value ≤ 0.05) were observed in the middle section of the river (110.1 mg/L to 181.8 mg/L), whereas NO2−-N (0.001 mg/L to 0.002 mg/L) and TAN (0.12 mg/L to 0.33 mg/L) values were significantly lower (p value ≤ 0.05) there. Significantly higher (p value ≤ 0.05) TAN was observed at stations 21 (1.57 ± 0.07 mg/L) and 22 (1.49 ± 0.20 mg/L), while significantly higher (p value ≤ 0.05) NO2−-N was observed at station 20 (0.055 ± 0.001 mg/L). Similar to BOD5 and NO3−-N, the NH3-N concentration in the uppermost part of the Baram River basin, which ranged from 0.7 mg/L to 2.0 mg/L [9], was lower than in the present study. The author attributed the high ammonia concentration in his study to sewage discharge from the longhouse and to animal waste. The higher nutrient concentrations in the present study indicate a deterioration of water quality over time due to the increase in population and land development in the area.
Table 3 shows that the river temperature, pH, conductivity, transparency, chl a, and NO2−-N were significantly higher (p value ≤ 0.05) in the tributaries than in the main river. The high water temperature in the tributaries, particularly at stations 12, 13, and 16 (>29 °C), indicated that direct solar radiation, due to the loss of forest canopy cover after logging, had increased the river temperature in those tributaries [6]. The Baram River basin contained high dissolved ions, which gave the high conductivity values in the river [9]. Besides, the significant positive correlation (p value ≤ 0.05) between temperature and conductivity indicated that the high temperature in the tributaries increased the ionic mobility and solubility of minerals, which is reflected in the high conductivity in the tributaries. Also, the high photosynthesis rate, as indicated by the high chl a in the tributaries, had increased the pH values there.

Note to Table 3: A positive mean difference indicates that the parameter was higher in the main river of the Batang Baram, whereas a negative value indicates that the parameter was higher in the tributaries. Significant differences (p value ≤ 0.05) are indicated in bold.
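The main river versus tributary comparison summarized in Table 3 can be illustrated with an independent samples t-test. The sketch below uses `scipy.stats.ttest_ind` with its default equal-variance assumption and reports the mean difference with the same sign convention as Table 3 (positive when the main river is higher). The function name `compare_groups` and the DO values are hypothetical and are not data from this study.

```python
from scipy import stats


def compare_groups(main_river, tributary, alpha=0.05):
    """Mean difference (main river minus tributary) and t-test result."""
    mean_diff = (sum(main_river) / len(main_river)
                 - sum(tributary) / len(tributary))
    result = stats.ttest_ind(main_river, tributary)   # equal variances assumed
    return mean_diff, result.pvalue, bool(result.pvalue <= alpha)


# Hypothetical DO readings (mg/L) for illustration only
main_river_do = [7.2, 7.4, 7.1, 7.5, 7.3]
tributary_do = [6.0, 6.2, 5.9, 6.1, 6.3]

mean_diff, p_value, is_significant = compare_groups(main_river_do, tributary_do)
print(round(mean_diff, 2), is_significant)
```

If the two groups have clearly unequal variances, passing `equal_var=False` to `ttest_ind` gives Welch's version of the test instead.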
On the other hand, the main river contained significantly higher (p value ≤ 0.05) DO, turbidity, TSS, BOD5, TP, and H2S (Table 3). Most of these parameters were significantly and positively correlated (p value ≤ 0.05) with total discharge and mean velocity of the river (Table 2). Hence, we can assume that the fast flowing main river had increased the DO content due to more rapid aeration and had introduced more pollutants into the river via surface runoff. Nevertheless, the tributaries of the Batang Baram were also well aerated, as all of the stations recorded DO content of more than 5 mg/L and DOsat of more than 80%. Table 4 shows that most of the sampling stations were classified as Class III and categorized as "slightly polluted" according to the water quality index (WQI). Among the 30 stations along the Batang Baram and its tributaries, only two stations, located on the tributaries Sungai Kesseh and Sungai Nakan, were categorized as "clean." The pH and DO were classified as Class I and/or Class II, indicating good condition, whereas COD was the worst parameter, with most of the stations classified as Classes III, IV, and/or V. The river is also at risk of pollution by suspended solids, as TSS was classified as Class III and/or Class IV at most of the stations and as Class V at stations 19 and 20. The results reveal a deterioration in the water quality of the Batang Baram and its tributaries when compared to the uppermost part of the Baram River basin reported by [9], where the river was grouped as a Class II river.

Cluster Analysis (CA).
Cluster analysis was used to detect similarities among the sampling stations in the study area. The dendrogram shows that the sampling stations in the present study can be grouped into four significant clusters, as illustrated in Figure 4. The clustering pattern shows that physicochemical characteristics of the Batang Baram changed [...] also showed similarity and were grouped together as cluster 3. These two clusters show that the main river and the tributaries of the Batang Baram located downstream shared no similarity. Finally, cluster 4 consists of stations located upstream, including stations on both the main river and the tributaries (station 1 to station 10). This analysis suggests that a reduced number of sampling stations in each cluster may serve for rapid assessment of the water quality of the Batang Baram and lead to more cost-effective monitoring studies in the future.
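The station grouping described in this section can be sketched with SciPy's hierarchical clustering routines: z-score standardization, Ward's method on Euclidean distances, and a cut of the tree at 60% of the maximum linkage distance. This is a minimal sketch, not the SPSS workflow used in the study; the function name `cluster_stations`, the interpretation of the 60% rule as a fraction of the largest merge distance, and the data are assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import zscore


def cluster_stations(data, cut_fraction=0.6):
    """Ward clustering of stations on z-scored parameters.

    data: array of shape (n_stations, n_parameters).
    The tree is cut at cut_fraction times the largest linkage distance
    (an assumed reading of the "<60%" criterion).
    """
    z = zscore(np.asarray(data, dtype=float), axis=0, ddof=1)
    tree = linkage(z, method="ward", metric="euclidean")
    d_max = tree[:, 2].max()                      # height of the final merge
    labels = fcluster(tree, t=cut_fraction * d_max, criterion="distance")
    return tree, labels


# Hypothetical stations: three "upstream-like" and three "downstream-like"
stations = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0],
            [10.0, 10.2], [10.1, 9.8], [9.9, 10.0]]
tree, labels = cluster_stations(stations)
print(labels)
```

The `tree` array can be passed to `scipy.cluster.hierarchy.dendrogram` to draw a figure comparable in form to Figure 4.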

Principal Component Analysis (PCA).
The PCA was used to explore the most important factors determining the spatial variations in the physicochemical parameters of the Batang Baram. A total of six principal components (PCs) with eigenvalues greater than one were obtained, which together accounted for around 83.6% of the total variance in the 20 physicochemical parameters of the Batang Baram (Table 5). The first component (PC1), accounting for 30.0% of the total variance in the data set, has significant positive loadings on turbidity, TSS, and H2S and negative loadings on conductivity and transparency. These factors imply that soil erosion occurred in the present study area, and the high loadings of turbidity and H2S are associated with the presence of suspended solids [23]. Similarly, strong positive loadings on turbidity and suspended solids were observed in a Mekong Delta area of Vietnam [11] as a result of soil erosion from disturbed land. The Sarawak forest is subjected to high timber harvesting pressure, which causes sedimentation problems in its forest streams [1,6,[24][25][26]. The present study shows that the Batang Baram in Sarawak state is no exception. Logging activities in the surrounding area have caused sedimentation and increased the level of suspended solids in the river. PC1 has the largest proportion of the total variance, indicating that logging activities are the major source of river water contamination in the Batang Baram. PC2, accounting for 18.6% of the total variance, has significant positive loadings on mean velocity, BOD5, TP, NO3−-N, and Org-N and negative loadings on pH and DOsat. These factors indicate an inflow of effluent from longhouses and residential areas consisting largely of organic pollutants, and the negative loadings of pH and DOsat are attributed to the decomposition of the organic matter. Similarly, PCA applied to the Qiantang River indicated that TN, NO3−, NH3-N, and TP were the dominant pollution factors in the river [22]. The authors attributed the pollution to domestic sewage, discharge of poultry and animal feces, and fertilizer flushed into the river. Also, an "organic" factor positively loaded with COD, BOD5, TON, TP, and PO4 3− was reported in a main river system in northern Greece, which represented the influence of municipal and industrial effluents [16]. PC3, accounting for 13.6% of the total variance, has significant positive loadings on total discharge, mean velocity, mean depth, DOsat, and DO and negative loadings on conductivity and COD, suggesting a dilution of chemically oxidizable material in the river associated with a high volume of river water and a high dissolved oxygen level. The high COD/BOD5 ratio in the Batang Baram indicates a large nonbiodegradable fraction of organic matter in the river.
PC4 (8.7% of the total variance) has significant positive loadings on transparency, TAN, NO3−-N, and H2S and negative loadings on temperature, pH, and conductivity. Under anaerobic conditions, a high loading of organic matter in a river can lead to the formation of ammonia and organic acids, which, coupled with the production of hydrogen sulphide and carbon dioxide during decomposition [27,28], can cause acidification of the water. By employing PCA for data interpretation, [10] also revealed that parameters related to organic pollutants and temperature were the most important parameters contributing to water quality variation in the Sava River, Croatia. PC5 (6.8% of the total variance) is significantly and negatively loaded on COD but positively loaded on NO2−-N. Again, the high loading of organic matter in the river likely led to the build-up of NO2−-N in the water. Similar to PC2, both PC4 and PC5 can be explained as influences from domestic discharges containing high nutrients and organic matter. As PC2 has a larger proportion of the total variance than PC4 and PC5, we can assume that organic pollution in the Batang Baram is more severe than inorganic pollution. Reference [29] also reported that organic

Figure 1: The study area in Sarawak state and location of the 30 sampling stations along the Batang Baram and its tributaries in the present study.

Figure 4: Clustering of the 30 sampling stations along the Batang Baram and its tributaries.

Table 3: Mean difference of in situ and ex situ water quality parameters between the main river of the Batang Baram and its tributaries.

Table 4: Classification of water quality of the Batang Baram from upstream to downstream regions according to WQI.

Table 5: Loadings of the physicochemical parameters on the first six varimax-rotated PCs (eigenvalue > 1) along the Batang Baram and its tributaries.
a Rotation converged in 11 iterations.