This is an illustration of the method known as Principal Component Analysis (PCA). You are encouraged to play around with the parameters, particularly sample size \(N\) and correlation \(\rho\), to see what effect any changes might have on the outcome.

As our data, we draw a sample from a two-dimensional normal distribution with mean vector \(\mu\)

```
mu <- c(2,4)
mu
```

`## [1] 2 4`

and covariance matrix \(\Sigma\)

```
rho <- 0.8
sigma <- matrix(c(1, rho, rho, 1), nrow=2, ncol=2)
rho
```

`## [1] 0.8`

`sigma`

```
## [,1] [,2]
## [1,] 1.0 0.8
## [2,] 0.8 1.0
```

where \(\rho\) determines the correlation between the two components. We take \(N\) independent copies of such two-dimensional random variables \(X=(X_1, X_2)\); the following figure shows a scatter plot of such a sample.

```
# mvrnorm() is provided by the MASS package
library(MASS)
N <- 10000
X <- mvrnorm(N, mu, sigma)
# Scatter plot of X = (X_1, X_2)
plot(X[,1], X[,2], xlab="", ylab="", main="Correlation > 0")
```

Here \(\rho > 0\) and we observe an ellipsoidal shape whose main axes are not parallel to the coordinate axes. For comparison, we also plot a sample with \(\rho = 0\):

```
rho <- 0
sigma2 <- matrix(c(1, rho, rho, 1), nrow=2, ncol=2)
X2 <- mvrnorm(N, mu, sigma2)
plot(X2[,1], X2[,2], xlab="", ylab="", main="No correlation")
```

We now extract the principal components associated with our two data sets \(\mathbf{X}\) (correlation \(> 0\)) and \(\mathbf{X}_2\) (no correlation). First, we estimate the mean vector (the same for both samples) and the two covariance matrices:

```
muEst <- colMeans(X)
covEst <- cov(X)
covEst2 <- cov(X2)
# Display estimates
muEst
```

`## [1] 2.014046 4.009036`

`covEst`

```
## [,1] [,2]
## [1,] 0.9947950 0.7916366
## [2,] 0.7916366 0.9862532
```

`covEst2`

```
## [,1] [,2]
## [1,] 1.00113238 0.01787527
## [2,] 0.01787527 1.02141634
```

Second, we find the eigenvectors associated with the two covariance matrices.

```
# Find eigenvectors & eigenvalues
# (the columns of $vectors are the eigenvectors)
eigenCorr <- eigen(covEst)
# Define o1 and o2 as orthogonal eigenvectors
o1 <- eigenCorr$vectors[,1]
o2 <- eigenCorr$vectors[,2]
eigenNocorr <- eigen(covEst2)
# Define o1.nocorr and o2.nocorr as orthogonal eigenvectors
o1.nocorr <- eigenNocorr$vectors[,1]
o2.nocorr <- eigenNocorr$vectors[,2]
```
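As a cross-check of the approach above, the same decomposition can be obtained with R's built-in `prcomp()`. The sketch below is self-contained (it regenerates a sample rather than reusing the objects from above, and the seed is an arbitrary choice for reproducibility):

```r
# mvrnorm() comes from the MASS package; the seed is an arbitrary choice
library(MASS)
set.seed(1)
Xchk <- mvrnorm(10000, c(2, 4), matrix(c(1, 0.8, 0.8, 1), 2, 2))
pca <- prcomp(Xchk)   # centers the data, then eigen-decomposes its covariance
pca$rotation          # columns agree with eigen(cov(Xchk))$vectors up to sign
pca$sdev^2            # variances along the PCs = eigenvalues of cov(Xchk)
```

Note that the sign of each eigenvector is arbitrary, so `prcomp()` and `eigen()` may return columns that differ by a factor of \(-1\).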

For the case where \(\rho >0\) we have

`o1`

`## [1] -0.7090116 -0.7051968`

`o2`

`## [1] 0.7051968 -0.7090116`

and for \(\rho = 0\), where the true covariance matrix is the identity so that any orthonormal basis is a valid set of eigenvectors (the estimated directions are therefore essentially arbitrary), we obtain

`o1.nocorr`

`## [1] 0.5032500 0.8641409`

`o2.nocorr`

`## [1] -0.8641409 0.5032500`
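For the correlated case, the estimated directions can be compared with the eigen-decomposition of the exact covariance matrix: for \(\rho = 0.8\) the eigenvalues are \(1 \pm \rho\) and the eigenvectors are proportional to \((1, 1)\) and \((1, -1)\), up to sign. A quick check (recomputing the true matrix rather than reusing objects from above):

```r
# Eigen-decomposition of the exact covariance matrix (rho = 0.8)
sigmaTrue <- matrix(c(1, 0.8, 0.8, 1), nrow = 2, ncol = 2)
eigenTrue <- eigen(sigmaTrue)
eigenTrue$values   # 1 + rho = 1.8 and 1 - rho = 0.2
eigenTrue$vectors  # columns proportional to (1, 1) and (1, -1), up to sign
```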

The next two figures show the eigenvectors for the two cases:
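One way to produce such a figure (a sketch under the same setup as above, regenerating the sample so the block is self-contained; the exact plotting commands behind the original figures are not shown) is to draw the eigenvectors as arrows from the sample mean, each scaled by the square root of its eigenvalue:

```r
library(MASS)  # for mvrnorm
set.seed(1)
X <- mvrnorm(10000, c(2, 4), matrix(c(1, 0.8, 0.8, 1), 2, 2))
e <- eigen(cov(X))
plot(X[, 1], X[, 2], xlab = "", ylab = "", main = "Correlation > 0")
# Draw each eigenvector as an arrow from the sample mean,
# scaled by the square root of its eigenvalue (arrows() is vectorised)
arrows(mean(X[, 1]), mean(X[, 2]),
       mean(X[, 1]) + sqrt(e$values) * e$vectors[1, ],
       mean(X[, 2]) + sqrt(e$values) * e$vectors[2, ],
       col = "red", lwd = 2)
```

The \(\sqrt{\lambda}\) scaling makes the arrow lengths reflect the standard deviation of the data along each principal direction.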