The Pearson correlation coefficient measures the strength and direction of the linear relationship between two random variables. For two random variables X and Y, the correlation is defined as the covariance of X and Y divided by the product of their standard deviations:

$$\rho_{X,Y} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$$

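If we have n observations $(x_1, y_1), \dots, (x_n, y_n)$ on two variables x and y, we can calculate the sample correlation:

$$r = \frac{1}{n-1} \cdot \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{s_x \, s_y}$$

Here, $\bar{x}$ and $\bar{y}$ are the sample means, $s_x$ is the sample standard deviation of x, and $s_y$ is the sample standard deviation of y.
The correlation coefficient must lie between -1 and 1: -1 ≤ r ≤ 1.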
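The sample correlation (the sample covariance divided by the product of the two sample standard deviations) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `sample_correlation` is an assumption made for this example, not something defined in the text.

```python
from math import sqrt


def sample_correlation(x, y):
    """Sample Pearson correlation r of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sample covariance and standard deviations, all with the n - 1 divisor.
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_x = sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

    # Undefined (ZeroDivisionError) if either variable is constant.
    return cov_xy / (s_x * s_y)
```

Note that r is undefined when either variable has zero standard deviation; this sketch simply lets the division fail in that case.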
Example.
The following table shows the calculation of the sample correlation coefficient, r:
| i | $x_i$ | $y_i$ | $x_i - \bar{x}$ | $y_i - \bar{y}$ | (3) × (4) |
|---|---|---|---|---|---|
| column | 1 | 2 | 3 | 4 | 5 |
| 1 | 1 | 2 | -3.6 | -4.6 | 16.56 |
| 2 | 2 | 5 | -2.6 | -1.6 | 4.16 |
| 3 | 6 | 3 | 1.4 | -3.6 | -5.04 |
| 4 | 5 | 10 | 0.4 | 3.4 | 1.36 |
| 5 | 9 | 13 | 4.4 | 6.4 | 28.16 |
| Total | 23 | 33 | | | 45.2 |
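Here, $\bar{x} = 23/5 = 4.6$ and $\bar{y} = 33/5 = 6.6$. The sums of squared deviations are 41.2 for x and 89.2 for y, so the sample standard deviations are $s_x = \sqrt{41.2/4} \approx 3.21$ and $s_y = \sqrt{89.2/4} \approx 4.72$. The sample correlation is then $r = \frac{1}{4} \cdot \frac{45.2}{(3.21)(4.72)} \approx 0.746$.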
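Recomputing the example from the raw data is a useful check on the hand arithmetic. The sketch below uses Python's standard `statistics` module (`mean` and `stdev`, where `stdev` uses the n - 1 divisor); the variable names are chosen for this illustration only.

```python
from statistics import mean, stdev

# Data from columns 1 and 2 of the table.
x = [1, 2, 6, 5, 9]
y = [2, 5, 3, 10, 13]
n = len(x)

x_bar, y_bar = mean(x), mean(y)  # 4.6 and 6.6

# Column 5 of the table: products of the deviations from the means.
products = [(xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)]
total = sum(products)

s_x, s_y = stdev(x), stdev(y)  # sample standard deviations
r = total / (n - 1) / (s_x * s_y)

print(round(total, 2), round(s_x, 2), round(s_y, 2), round(r, 3))
# prints: 45.2 3.21 4.72 0.746
```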
Properties
- The correlation coefficient always lies in the interval [-1,1]
- A correlation of zero means that there is no linear relationship between the two variables; the variables may still be related in a nonlinear way
- A correlation of 1 or -1 means that one variable can be written as an exact linear function of the other variable, Y = α + βX, where α and β are constants and β ≠ 0; the sign of β matches the sign of the correlation
- A strong correlation between two variables should not be interpreted as a causal relationship between them. For example, daily snowfall and temperature in Iowa are strongly negatively correlated. However, a drop in the temperature does not cause snow to fall.
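The first three properties can be checked numerically. The following is a small sketch on made-up data (the x values, the lines 3 + 2x and 3 - 2x, and the helper name `pearson_r` are all assumptions chosen for illustration): an exact line gives r = 1 or r = -1 depending on the slope, while an exact parabola on symmetric x values gives r = 0 even though y is completely determined by x.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample Pearson correlation; the n - 1 divisors cancel, so raw sums suffice."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


x = [-2, -1, 0, 1, 2]  # made-up illustrative values

r_up = pearson_r(x, [3 + 2 * v for v in x])    # exact line, positive slope
r_down = pearson_r(x, [3 - 2 * v for v in x])  # exact line, negative slope
r_curve = pearson_r(x, [v ** 2 for v in x])    # exact parabola, no linear trend

print(round(r_up, 6), round(r_down, 6), round(r_curve, 6))
# prints: 1.0 -1.0 0.0
```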
Anscombe's quartet
The following image displays four datasets called Anscombe's quartet. In each case, the mean and standard deviation of each variable are nearly the same, and the correlation between the two variables is nearly the same (about 0.816). However, the data look very different when graphed. The lesson: look at your data before you do your analysis.
This image is from Wikimedia Commons.
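The summary statistics of the quartet can be reproduced with a short script. The data values below are the standard published values of Anscombe's four datasets, not values given in this text, and the helper name `pearson_r` is an assumption for this sketch. Each dataset should come out with an x mean of 9, an x standard deviation of about 3.32, a y mean of about 7.50, a y standard deviation of about 2.03, and a correlation of about 0.82.

```python
from statistics import mean, stdev


def pearson_r(x, y):
    """Sample Pearson correlation using the n - 1 divisor throughout."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (len(x) - 1)
    return s_xy / (stdev(x) * stdev(y))


# Datasets I to III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

summary = {}
for name, (x, y) in quartet.items():
    # (mean of x, sd of x, mean of y, sd of y, correlation)
    summary[name] = (mean(x), stdev(x), mean(y), stdev(y), pearson_r(x, y))
    print(name, [round(v, 2) for v in summary[name]])
```

The near-identical rows are the point: none of these numbers reveals that dataset II is curved, that dataset III has a single outlier, or that dataset IV gets its correlation from one extreme point.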
