This is an illustration of the method known as Principal Component Analysis (PCA). You are encouraged to play around with the parameters, in particular the sample size \(N\) and the correlation \(\rho\), to see how such changes affect the outcome.

As our data, we draw a sample from the two-dimensional normal distribution with mean vector \(\mu\)

mu <- c(2,4)
mu
## [1] 2 4

and covariance matrix \(\Sigma\)

rho <- 0.8
sigma <- matrix(c(1, rho, rho, 1), nrow=2, ncol=2)
rho
## [1] 0.8
sigma
##      [,1] [,2]
## [1,]  1.0  0.8
## [2,]  0.8  1.0

where \(\rho\) determines the correlation between the two components. We draw \(N\) independent copies of the two-dimensional random variable \(X=(X_1, X_2)\); the following figure shows a scatterplot of such a sample.

# mvrnorm() from the MASS package draws multivariate normal samples
library(MASS)
N <- 10000
X <- mvrnorm(N, mu, sigma)
# Scatter plot of X = (X_1, X_2)
plot(X[,1], X[,2], xlab="", ylab="", main="Correlation > 0")

Here \(\rho > 0\) and we observe an elliptical shape whose main axes are not parallel to the coordinate axes. For comparison, we also plot a sample where \(\rho = 0\):

rho <- 0
sigma2 <- matrix(c(1, rho, rho, 1), nrow=2, ncol=2)
X2 <- mvrnorm(N,mu,sigma2)
plot(X2[,1], X2[,2], xlab="", ylab="", main="No correlation")

PCA

We now extract the principal components associated with our two data sets \(\mathbf{X}_1\) (correlation \(> 0\)) and \(\mathbf{X}_2\) (no correlation). First, we estimate the mean vector (the same for both samples) and the covariance matrices:

muEst <- colMeans(X)
covEst <- cov(X)
covEst2 <- cov(X2)
# Display estimates
muEst
## [1] 2.014046 4.009036
covEst
##           [,1]      [,2]
## [1,] 0.9947950 0.7916366
## [2,] 0.7916366 0.9862532
covEst2
##            [,1]       [,2]
## [1,] 1.00113238 0.01787527
## [2,] 0.01787527 1.02141634

Second, we compute the eigenvalues and eigenvectors of the two covariance matrices.

# Eigendecomposition of the estimated covariance matrices;
# the columns of $vectors are the (orthonormal) eigenvectors.
# (Named eig rather than eigen to avoid shadowing the base function.)
eig <- eigen(covEst)
o1 <- eig$vectors[,1]
o2 <- eig$vectors[,2]
eig2 <- eigen(covEst2)
o1.nocorr <- eig2$vectors[,1]
o2.nocorr <- eig2$vectors[,2]

For the case \(\rho > 0\) we have

o1
## [1] -0.7090116 -0.7051968
o2
## [1]  0.7051968 -0.7090116

and for \(\rho = 0\) the eigenvectors are

o1.nocorr
## [1] 0.5032500 0.8641409
o2.nocorr
## [1] -0.8641409  0.5032500

The next two figures show the eigenvectors for the two cases:
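The plotting code for these figures is not shown above; a minimal sketch using base graphics could look as follows, where each eigenvector is scaled by its eigenvalue purely for visibility (the factor 2 is an arbitrary choice of ours, not part of the analysis):

# Overlay the (scaled) eigenvectors on the correlated sample
plot(X[,1], X[,2], xlab="", ylab="", main="Correlation > 0")
arrows(muEst[1], muEst[2],
       muEst[1] + 2*eig$values[1]*o1[1],
       muEst[2] + 2*eig$values[1]*o1[2], col="red", lwd=2)
arrows(muEst[1], muEst[2],
       muEst[1] + 2*eig$values[2]*o2[1],
       muEst[2] + 2*eig$values[2]*o2[2], col="blue", lwd=2)
# Analogously for the uncorrelated sample X2, using eig2, o1.nocorr, o2.nocorr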

We now plot the original data and the transformed data, obtained by centering the data around 0 and then multiplying with the orthogonal matrices, in the same scatterplot.
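The transformation itself is not spelled out above; using the definitions so far, a sketch of it for the correlated sample could be:

# Center the correlated sample and rotate it onto the eigenvector basis
O  <- eig$vectors            # orthogonal matrix whose columns are o1 and o2
Xc <- sweep(X, 2, muEst)     # center the data around 0
Y  <- Xc %*% O               # transformed data (principal component scores)
plot(Xc[,1], Xc[,2], xlab="", ylab="", main="Original (black) vs transformed (red)")
points(Y[,1], Y[,2], col="red")
# The same steps apply to X2, using eig2 and colMeans(X2)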

Comparing the two, we clearly see the effect the transformation has (or does not have) on the shape of the original data. Lastly, we compute the ratio between the individual eigenvalues and the sum of all eigenvalues for both cases. For \(\rho > 0\),

eig$values/sum(eig$values)
## [1] 0.8996108 0.1003892

and for \(\rho = 0\),

eig2$values/sum(eig2$values)
## [1] 0.5101614 0.4898386

This further illustrates how, in the first case, PCA identifies the directions of importance, whereas in the second case the analysis does not provide any new, useful information.
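As a sanity check, which is not part of the walkthrough above, the same decomposition can be obtained with R's built-in prcomp(); its rotation matrix should agree with the eigenvectors found above up to sign, and its reported proportions of variance with the eigenvalue ratios:

# Cross-check the manual PCA against R's built-in implementation
pca <- prcomp(X)        # centers the data by default
pca$rotation            # columns agree with o1, o2 up to sign
summary(pca)            # "Proportion of Variance" matches the ratios above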