Evaluation of Water Quality of Kaveri River in Tiruchirappalli District, Tamil Nadu by Principal Component Analysis

Principal component analysis (PCA) is a widely used technique for reducing the dimensionality of data. In this study, ten water quality variables of the river Kaveri (nine measured parameters and the overall quality ranking), observed at five different stations in Tiruchirappalli for six years, were collected and subjected to principal component analysis. A computational program was prepared in order to process and understand the data as a cluster. First, the data required by the program were listed and fed to it. The outputs were then analyzed for possible linear and non-linear relationships between the water quality parameters and the timeline. Biological oxygen demand and fecal coli were found to have a linear relationship. Further, the results suggested groups of factors that influence the water quality in a particular year.

Among the water bodies, rivers are the most vulnerable to pollution due to erosion and dissolution of minerals from overlying rocks as well as anthropogenic activities such as the discharge of industrial effluents and municipal sewage into rivers. Pollution of river water alters species composition and degrades the health of both aquatic and human communities. As a common practice, the water quality of rivers is monitored periodically by measuring multiple parameters at different monitoring stations. These measurements yield a large number of complex physico-chemical and biological parameters, which must be assessed further to understand the water quality.
Since rivers are the most important resources for human consumption, it is important to have reliable and easily understandable information on the characteristics of water quality and its trends for effective water management. Several previous studies have applied different analysis techniques such as water quality indices (WQI), structural dynamic models, fuzzy logic inference and others for the evaluation of water quality. However, these methods are not applicable to large-scale data and long-term monitoring of rivers. Recently, multivariate statistical analysis techniques such as Principal Component Analysis (PCA) have been successfully employed in a number of water quality analyses. All these studies showed that PCA can interpret large-scale complex data by correlating the different water quality parameters. PCA has also been successfully applied in quasi-harmonic analysis in protein research, molecular dynamics, prediction of suitable corrosion inhibitors, and the interpretation of metabolomic data obtained from NMR, among others [1][2][3][4][5][6][7][8][9].
In this study, the water quality parameter database of the river Kaveri observed at five different stations was subjected to principal component analysis with a view to extracting information about the similarities and dissimilarities among the samples collected at the different stations. In addition, the water quality parameters that have the greatest influence on the samples were also evaluated.

EXPERIMENTAL
The river Kaveri is one of the major resources of potable water in Tamil Nadu, India. This river is polluted by various types of anthropogenic activities. Therefore, it is important to monitor and analyse the water quality at different points along the course of the river. In this study, nine water quality parameters, namely pH, dissolved oxygen (DO), biological oxygen demand (BOD), chloride, sulphate, nitrate, total hardness, fecal coli, and total coli, were observed at five different stations. The water quality parameters were determined from the raw data recorded by the Government of Tamil Nadu in the district of Tiruchirappalli alongside the Kaveri during 2004-2011 [10]. However, the raw data for 2007-2008 are missing. Further, some data are incomplete for a few years, which may be due to technical failure in the measurement. In such cases, the values corresponding to the previous year were used in order to avoid difficulty in evaluating the principal components. Principal component analysis (PCA) was performed using Scilab software. A dedicated program was prepared to read the data, and the outputs were obtained as figures.
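The two processing steps described above, filling a missing value with the value of the previous year and then computing the principal components, can be sketched as follows. This is only an illustrative sketch in Python with NumPy, not the Scilab program used in this study. The function names and the small table of values are hypothetical, and standardizing each variable before the decomposition is an assumption that the text does not state.

```python
import numpy as np

def fill_with_previous_year(data):
    """Replace each missing (NaN) entry with the value from the previous row (year).

    A gap in the first row has no previous year and is left as NaN.
    """
    filled = np.array(data, dtype=float)
    for row in range(1, filled.shape[0]):
        missing = np.isnan(filled[row])
        filled[row, missing] = filled[row - 1, missing]
    return filled

def pca(data):
    """PCA of a (years x variables) table via SVD of the standardized data.

    Standardization (zero mean, unit variance per variable) is an assumption
    made here because the variables have different units.
    """
    centered = data - data.mean(axis=0)
    standardized = centered / data.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(standardized, full_matrices=False)
    scores = u * s                       # coordinates of the years
    loadings = vt.T                      # directions of the variables
    explained = s**2 / np.sum(s**2)      # fraction of variance per component
    return scores, loadings, explained

# Hypothetical values for illustration only (not the measured data).
# Rows are years; columns are pH, DO and BOD.
table = np.array([
    [7.8, 6.9, 1.2],
    [8.0, np.nan, 1.5],
    [7.6, 6.1, 2.4],
    [7.9, 5.8, np.nan],
    [8.1, 5.5, 3.0],
])

filled = fill_with_previous_year(table)
scores, loadings, explained = pca(filled)
```

Plotting the first two columns of `scores` gives a plot of the years, and the first two columns of `loadings` give a plot of the variables.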

RESULTS AND DISCUSSION
Generally, the quality of water is classified from A to E. For the purpose of computation, the authors replaced these letters with the numbers 1 to 5. The water quality parameters observed at five different stations and their ranking are given in Tables 1-5. It should be noted that the water quality parameters along with the ranking are considered as variables (V1 to V10) and the years as constants. Using these parameters, a program was prepared and executed using Scilab software. The output figures are shown in Figures 1-5. Figure 1a shows that the angles between V1 and V4, V2 and V5, V3 and V5/V6, V4 and V6, V5/V6/V7 and V9/V7 were about 90°. This is attributed to the absence of a linear relationship (described here as a non-linear relationship) between these variables. The variables that are close to each other can be grouped as (V1, V2, V10), (V3, V9), (V5, V6), (V4, V7). According to PCA, the variables within each group are directly related to one another. For instance, BOD (V3) and total coli count (V9) are found to have a linear relationship. In practice, BOD is directly related to the growth of microbes. Further, the angles between V9 and V4/V7 and between V8 and V10 are close to 180°, which is attributed to an inverse linear relationship between them. Apparently, the concentration of chloride (V4) in the water could affect the growth of microbes (V9).
In the case of the samples collected at the upper stream of the Kaveri in Tiruchirappalli, the pattern was distinct. For instance, V1 lies apart from all other parameters, which suggests that it has less influence over the quality of the water in the particular year (Figure 2a). Meanwhile, it is interesting to note that, except for V1 and V10, all other variables can be grouped into two. On the one hand, V2, V3, V8, and V9 are close to each other. On the other hand, V4, V5, V6, and V7 are close to each other. As discussed above, the closeness of the variables suggests a direct or linear relationship between these parameters in determining the quality of water. It is interesting to note that the quality of water during ...
Similar to the samples collected at Patharakaliamman Koil and Tiruchirappalli upper stream, the data of the other samples collected at Grand Anicut, Tiruchirappalli downstream, and Kollidam can also be discussed. It is interesting to note that in almost all of the samples collected at the different stations, the variables V3 and V9 are in the same quadrant with more or less similar values. This strongly suggests a linear relationship between BOD (V3) and fecal coli (V9). It should be noted that in all the samples, the data collected in the years 2004-2005 and 2010-2011 show opposite trends. This is due to the increase in parameters such as chloride, sulphate, nitrate and fecal coli, which subsequently led to the decrease in water quality.
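The reading of angles between variables used in this discussion can be illustrated with a short sketch. It is written in Python with NumPy under stated assumptions and is not the program used in this study: the function name `variable_angles` and the four data columns are hypothetical, and each variable is represented by its loading scaled by the singular value in the plane of the first two components.

```python
import numpy as np

def variable_angles(data):
    """Angles (in degrees) between variable vectors in the PC1-PC2 plane."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    # Variable vectors: loadings scaled by singular values, first two PCs only.
    vectors = (vt[:2] * s[:2, None]).T
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    cosines = np.clip(unit @ unit.T, -1.0, 1.0)
    return np.degrees(np.arccos(cosines))

# Hypothetical columns for illustration only (not the measured data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
data = np.column_stack([
    x,                                            # column 0
    2.0 * x + 1.0,                                # rises with column 0
    10.0 - x,                                     # falls as column 0 rises
    np.array([1.0, -1.0, 0.0, 0.0, -1.0, 1.0]),   # uncorrelated with column 0
])

angles = variable_angles(data)
```

In this constructed example, `angles[0, 1]` is close to 0° (variables that rise together), `angles[0, 2]` is close to 180° (one rises as the other falls), and `angles[0, 3]` is close to 90° (no linear correlation). With real data the two-component plane is only an approximation, so the angles should be read as indicative rather than exact.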

CONCLUSION
PCA is a very useful technique for reducing the dimensionality of the data for subsequent ... Further, the relationship between BOD and fecal coli played a major role in the quality of the water. Similar to this study, several variables and constants can be considered and taken into account for the creation of a database, which will be very useful to researchers for a better understanding of water quality.

Fig. 1: Scores and relationship plots of (a) the water quality parameters (variables, V) and (b) the overall quality of water during different years (constants), observed at Pathrakaliamman Koil

Fig. 3: Scores and relationship plots of (a) the water quality parameters (variables, V) and (b) the overall quality of water during different years (constants), observed at Grand Anicut

Fig. 5: Scores and relationship plots of (a) the water quality parameters (variables, V) and (b) the overall quality of water during different years (constants), observed at Kollidam